calicoding

Ok I wasn’t sure at first, but it seems like the performance issue I’m facing is partially due to ref counting. It’s not free!

I’m working in a system I didn’t design, that relies heavily on reference types and inheritance. Given the chance, I probably would have designed it differently, leaning more on value types.
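To illustrate the trade-off being described (all names here are hypothetical, not from the actual system): a class instance is heap-allocated and reference-counted, so every assignment and every pass through a collection can touch the refcount, while a struct of plain values is copied inline with no retain/release traffic.

```swift
// Reference type: every copy of the reference retains, every scope exit releases.
final class NodeRef {
    var value: Int
    init(value: Int) { self.value = value }
}

// Value type: copies are plain bitwise-style copies, no refcount involved.
struct NodeVal {
    var value: Int
}

func sumRefs(_ nodes: [NodeRef]) -> Int {
    // Iterating class instances can generate retain/release traffic per element.
    nodes.reduce(0) { $0 + $1.value }
}

func sumVals(_ nodes: [NodeVal]) -> Int {
    // Same logic over value types: no per-element retain/release.
    nodes.reduce(0) { $0 + $1.value }
}
```

This is only a sketch of the shape of the design choice; whether the refcount traffic matters depends entirely on how hot the code path is.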

LOL so yes, ref counting was a bottleneck, but only because I’m an idiot and accidentally called my heavy quadratic calculation hundreds of times more than I needed to 🤦‍♂️

I wish this was easier to see in the profiler. But also: check your assumptions!

Accidentally quadratic? Pffft that’s rookie numbers. This was accidentally quartic
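The "accidentally quartic" pattern above can be sketched like this (a hypothetical reconstruction, not the actual code): an O(n²) helper invoked from inside an O(n²) loop, when its result doesn't depend on the loop variables at all.

```swift
// O(n^2) helper: sum of all pairwise products.
func pairwiseSum(_ xs: [Int]) -> Int {
    var total = 0
    for a in xs { for b in xs { total += a * b } }
    return total
}

// Accidentally O(n^4): recomputes the same quadratic result n^2 times.
func slowScore(_ xs: [Int]) -> Int {
    var score = 0
    for _ in xs {
        for _ in xs {
            score += pairwiseSum(xs)   // loop-invariant: doesn't use the loop vars!
        }
    }
    return score
}

// Fixed: hoist the invariant call; back to O(n^2) overall.
func fastScore(_ xs: [Int]) -> Int {
    let cached = pairwiseSum(xs)
    return xs.count * xs.count * cached
}
```

A profiler shows `pairwiseSum` dominating either way, which is why the extra ×n² of call sites is easy to miss until you check your assumptions about how often it runs.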

@calicoding I saw your first post and was like “hmmmmm”, but this makes more sense

@mattiem @calicoding Ref counting actually is expensive because it’s thread-safe; it’s the global interpreter lock of Swift.
(and it doesn’t just affect user types, all the CoW types use it)
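The point about CoW (copy-on-write) types can be sketched with a toy container (illustrative names, not the stdlib's actual implementation): the value type wraps a reference-counted buffer and checks uniqueness, which is a read of that thread-safe refcount, before every mutation.

```swift
// Reference-counted backing store shared between value-type copies.
final class Storage {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

struct CoWArray {
    private var storage: Storage
    init(_ elements: [Int]) { storage = Storage(elements) }

    var elements: [Int] { storage.elements }

    mutating func append(_ x: Int) {
        // This uniqueness check is why even "pure value" code still leans
        // on thread-safe reference counting under the hood.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.elements)   // copy only when shared
        }
        storage.elements.append(x)
    }
}
```

So `var a = CoWArray([1]); let b = a; a.append(2)` copies the buffer exactly once, at the first mutation, and `b` keeps seeing `[1]`.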

@mattiem @calicoding Something requiring a lock of some sort, resulting in cross-core cache flushes and such (I don’t know the actual effects; it probably depends a lot on the platform)

@helge @mattiem @calicoding on Apple Silicon™️, the cost at least only shows up if you have actual contention accessing the refcount

@joe @mattiem @calicoding Even if there is some atomic instruction doing the thing, the cores would still have to synchronize, i.e. flush their pipelines, no? I don’t know much about such low levels, and some info on why it isn’t expensive would be welcome 🙃
Or how expensive it is compared to a simple rc++ increment.
My assumption is that atomic RC is massively more expensive; is that wrong?

@helge @mattiem @calicoding the particular instructions used for refcounting get speculatively executed as if they were nonatomic, so in the case where there’s no contention there’s very little overhead (the atomic codegen still needs a few more instructions than a nonatomic update, etc.). If it turns out later that the memory location was contended, you throw all that work away and do it properly.

@joe @helge @mattiem in my case, everything was run on the main thread, so I don’t think there was any contention.

Profiling in release mode made the results in the Time Profiler instrument harder to interpret. In this case it actually helped to profile in debug: fewer crazy optimizations. But wow, array indexing in debug mode seems like it has a ton going on

@calicoding @helge @mattiem yeah debug profiling probably isn't very representative of optimized codegen, because so much is left explicit from the standard library implementation

@joe @helge @calicoding even CPU instructions don’t want to work anymore this is getting ridiculous

@mattiem @joe @calicoding There is a reason why "lazy" is a keyword in Swift
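For readers outside the joke: `lazy` defers a stored property's initializer until first access, so a heavy computation runs at most once, and only if it's actually needed. A minimal sketch with hypothetical names, using a counter to make the evaluation visible:

```swift
var heavyCallCount = 0

// Stand-in for some expensive calculation.
func heavySum(_ xs: [Double]) -> Double {
    heavyCallCount += 1
    return xs.reduce(0, +)
}

final class Report {
    let samples: [Double]
    init(samples: [Double]) { self.samples = samples }

    // Not evaluated at init time; computed on first access, then cached.
    lazy var mean: Double = heavySum(samples) / Double(max(samples.count, 1))
}

let r = Report(samples: [2, 4, 6])
// heavyCallCount is still 0 here: nothing has been computed yet.
let m = r.mean     // first access triggers the computation (m == 4)
_ = r.mean         // second access reuses the cached value
// heavyCallCount is 1: the heavy work ran exactly once.
```

Note that `lazy` in Swift is not thread-safe on its own; concurrent first accesses need external synchronization.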

@mattiem already working from home what more do they want

@helge @mattiem @calicoding a good chunk (most?) of the remaining overhead in swift_retain/release is in the call and dyld thunk, to the point we've been contemplating horrible ways to avoid that without giving up ABI flexibility for the object header