This is a weird (to me) way of presenting Zig syntax and operation. I would generally assume that someone coming to Zig would be familiar with C/C++ or Rust and only need to be able to map Zig syntax to things they already know.
If the author's content is all in this 'assume no C background' style, it will be useful to those without previous manual memory management background in contrast to most articles that assume a lot of context.
In C you can use `[0]` as a postfix dereferencing operation... ugly but it works. I use it a lot in GDB. But I agree that Pascal does it better here, the dereferencing operator should have been postfix all along.
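For anyone who hasn't seen the `[0]` trick, here is a minimal sketch in C. The `struct point` type and the function names are made up for illustration; the only thing it relies on is that C defines `p[0]` as `*(p + 0)`.

```c
#include <assert.h>

struct point { int x; int y; };

/* Three equivalent ways to read a field through a pointer. */
int read_x_arrow(const struct point *p) { return p->x; }    /* dedicated operator   */
int read_x_star(const struct point *p)  { return (*p).x; }  /* prefix deref + parens */
int read_x_index(const struct point *p) { return p[0].x; }  /* postfix "deref"       */

/* p[0] is an lvalue, so it can also be assigned through. */
void set_via_index(int *p, int value) { p[0] = value; }

/* Hypothetical demo helper: checks that all spellings agree. */
void demo_postfix_deref(void) {
    struct point pt = { 3, 4 };
    int n = 0;
    set_via_index(&n, 42);
    assert(n == 42);
    assert(read_x_arrow(&pt) == 3);
    assert(read_x_star(&pt) == 3);
    assert(read_x_index(&pt) == 3);
}
```

The `p[0].x` form reads strictly left to right, which is roughly what Pascal's `p^.x` and Zig's `p.*.x` give you by design.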
The thing I really wish C had, and which there is no straightforward workaround for, is postfix casting. When you do `((struct foo*)(COMPLEX_EXPRESSION))->field`, I think it would read a lot better if the `(struct foo*)` cast was on the same side as `->field`. Maybe something like `(COMPLEX_EXPRESSION)@(struct foo*)->field`
`->` shouldn't be necessary to begin with... if the LHS is a pointer already, `.` could just auto-dereference it instead of generating an error like it does now.
Note that when you define smart pointer types in C++ there is indeed a difference between `.` and `*` resp. `->`: the latter are typically overloaded to operate on the underlying pointer type whereas the former operates on the smart pointer class itself (e.g. calling `reset` on a `std::unique_ptr`). Not sure how you would do that with a single dereference operator.
It would have to have been designed to use external functions instead (eg spelled std::reset), which might even make the API more consistent since many accessor functions already require doing just that (eg std::begin)
Free functions like `std::begin` are meant for generic code. Apart from that, how would you even access the private members if the dot-operator was overloaded? (C++ does not support overloading the dot-operator for good reasons.)
I always wondered about that. Is there some technical reason that it was designed the way it was? Is it disambiguating something?
It seems like `.` could in theory be made to recursively dereference any number of layers down to the base type. It's not as though `.` has any other possible meaning when used directly against an address ... right?
Historical reasons: early C compilers didn't actually keep track of what the type of the LHS was, so they needed `.` vs `->` to properly disambiguate. Later C compilers did start tracking that, so the difference in syntax was no longer necessary, but stuck for reasons of history and backward compatibility
Besides the sibling comment, I believe many of the finer details of Chomsky's hierarchy weren't understood at the time; language grammars were just sort of "home-grown".
In Pascal, it's a distinct operator from field access, though.
What this actually reminds me of is Ada, where you write `value.all` to basically the same effect as Zig's `value.*`. It's as if everything is a struct with itself as a member.
Didn't know anything about Zig, but the .* syntax alone was straight-forward and easy to understand, as is regular copying from one existing memory location to another.
I think what seems weird though is more the "... = .{...}" syntax.
This makes it look as if there was some kind of anonymous object ".{...}" in memory that you're copying from, but there isn't. It's actually just a writing operation and the .{...} itself doesn't represent anything.
Maybe that was also what the author found confusing?
I find your comment confusing.
How is that not like writing an anonymous (or more precisely inferred type) struct literal into a sized memory location ?
I like this syntax. It's like saying "give me all the members of this object". Kind of like using full-slice [:] in Python to copy a list's values instead of passing by reference.
Yeah I found this article super weird. Explaining pointers & dereferencing is reasonable, but doing it in the context of Zig specifically, like Zig is the first language to feature dereferencing, is odd. Especially since pointers are such a fundamental part of low-level (really, any) programming. Also not sure why it's posted here?
Edit: Going through this author's website, it seems like a lot of their posts are about rediscovering low-level programming concepts through Zig. Like this article, where they discover you can't compare strings directly, and you have to use memcmp:
They claim that they blog because they "find that [they] retain things better when I write about them." No problem with that. Just a little odd to see on the hn front page, I suppose.
I think this is just "discovering". While you (and I) probably discovered these things with assembly or C languages, it's perfectly reasonable or even appropriate for newer generations to have these kinds of experiences with Zig or Rust.
Absolutely, like I said, there's no problem with what this person is doing or the way that they're exploring computers, I'm just confused as to why it's being posted here.
Maybe I'm wrong and this is new to a lot of people! I have a limited perspective based on my programming journey, which winds mostly through gamedev, graphics programming, and DSP, all (typically) low level domains. But I think if the title of the article were more accurate (e.g. "What are pointers?"), my reaction would be more clear. I'm also kinda taken aback by the "old grey beard" comment. Look at all the kids using Rust, Zig, even C and tell me that this is obscure knowledge.
When I took my first programming class at RIT, visualizing the stack, variables, and pointers was one of the first classes we had. It's one of those beginner diagrams that I feel like everyone is familiar with. But I can understand that there are programming domains with equivalent complexity which don't require that base knowledge. I apologize if I came off as elitist.
I don't think people understand how abstracted most modern developers in today's world are.
If learning piano were the example, then everyone would have to, like you said, learn the basics of everything and then build up on them. Modern-day people learn a few different chords and somehow string them together. If you do EE, you have learned how the piano works before you start playing the piano.
Worth remembering most people in programming today start with Javascript / Python or Ruby.
At least Python can be as complex as C++, it is a matter of actually bothering to go through all major manuals in https://docs.python.org/3.
Those languages are hardly more abstract than learning Lisp or Prolog back in the day, other than (with exception of JavaScript) finally embracing dynamic compilers.
It boils down to how much one actually cares to learn and is curious to improve themselves.
Author here. I agree this is a less-than captivating piece. I write a lot about Zig and wanted something I could reference from other pieces.
But, to answer your question directly: absolutely. In addition to writing a lot about it, I maintain some popular libraries and lurk in various communities. Let me assure you, beginner memory-related questions come up _all the time_. I'd break them down into three groups:
1 - Young developers who might have a bit of experience in JavaScript or python. Not sure how they're finding their way to Zig. Maybe from HN, maybe to do game development. I think some come to Zig specifically to learn this kind of stuff (I've always believed most programmers should know C. Learning Zig gets you the same fundamentals, and has a lot of QoL stuff).
2 - Hobbyist. Often python developers, often doing embedded stuff. Might be looking to write extensions in Zig (versus having to do it in C).
3 - Old programmers who have been using higher level languages for _decades_ and need a refresher. Hey, that's me!
Also, from learning human languages, it's a well-known lesson that phrasebook-type "this means this" translations (like some here are asking, from Zig to C/Rust) are useful for quick and dirty learning good enough for one trip, but long term learning needs this kind of a direct explanation.
1. It avoids the word (or syntax in this case) getting stuck in a double-indirection state, needing you to mentally translate it from Zig to C to what it actually means every time.
2. It avoids the learner attaching the wrong nuances to the word or syntax feature, based on the translation they're given, when the language they're learning has different nuances. In other words, it helps the learner see it as its own thing, and not be unduly colored by what they already know and find easy to grasp on to (even when it's subtly wrong).
> Also, from learning human languages, it's a well-known lesson that phrasebook-type "this means this" translations (like some here are asking, from Zig to C/Rust) are useful for quick and dirty learning good enough for one trip, but long term learning needs this kind of a direct explanation.
This is also good when the user already knows the concept. Like, I'm a reasonably competent Rust user, and when I started playing around with Zig I already understood the majority of concepts at play and just needed to know how Zig spelled them. But I still needed that more in-depth explanation for concepts I was less familiar with (comptime being the main one).
The author may not know C. As OP said, it's definitely the case that people interested in Zig may not have any interest in going through C first. They may eventually have to, as Zig interfaces heavily with C code and libs, but there's nothing wrong with going Zig first.
It feels a bit performative to me: the article goes out of its way not to explain it by giving the C equivalent.
Perhaps some day Zig will have replaced C and beginners will come to Zig having never touched C, and in that context this approach makes sense - after all, you wouldn't litter an introductory article on C with comparisons to Algol. But today, surely the modal Zig beginner already knows enough C that the syntax would better be explained by reference to C.
And why change something like that? The world would be a better place, IMHO, if there were fewer different ways to write something from language to language.
Sometimes it seems like the change is just to make it different, not better.
I think Rust could use some syntax improvements for pointer manipulation. It has suffix .await, so maybe the community is ready for suffix .* for dereference now too. Visually scanning left and right, along with the extra parentheses, makes pointer-based code in Rust worse than C++ almost. A fully left-to-right syntax would be amazing. EDIT: found https://github.com/rust-lang/rust/issues/10011 and https://github.com/rust-lang/rfcs/pull/3577
I’ve gotta say, when I first learned Rust had gone its own way on await, I was heavily sceptical. But seeing the actual examples was pretty compelling.
Yeah I was skeptical too but it makes so much more sense. Now I'm constantly thinking "this is dumb" when using other languages with the "normal" syntax.
I wish they'd got it right for (de)referencing too though.
The fact that the author of Zig who's extremely well versed in C and C++ went this way should make you at least try to think why he went that way before dismissing it.
It's remarkably intuitive and sensible if you remember that . in Zig auto-dereferences for field access and then treat `*` in this construct as field name denoting the entire object.
Ada does the same exact thing, except there you write `p.all` instead.
In any case, while the exact syntax may not be ideal, using a postfix operator for dereferencing is vastly better than prefix in practice due to typical patterns of use. There's a reason why you end up writing () a lot in C code with heavy pointer operations - the things they end up mixed with most often turn out to have the wrong kind of priority and/or precedence more often than not. Things are much simpler when everything is postfix and code reads naturally left-to-right.
(I'd suggest the latter is now the more readable of the two).
Where there is a VFT in each of the xxx_out structures, and the calls simply return the VFT, while the whole abstraction is stored/returned via the out arguments.
In general, the rule of thumb is that prefix and postfix operators don't mix well in a single expression - order of operations is confusing, and reading the code requires going back and forth to follow it.
In the ideal world, we wouldn't have unary prefix operators at all, but unfortunately unary +, -, and NOT are prefix mostly because they were that in math notation and got grandfathered in (bonus points to Smalltalk here for going with consistency here - "not" and "negated" are regular nullary methods there so it's postfix throughout!). However, these are rarely themselves an operand of another unary operator, so you can mostly deal with this by giving postfix higher precedence than prefix in all cases, so at least it's a simple rule. But then for pointer dereference, it is in fact common to have the result of a dereference itself be an operand in the middle of another expression.
So now you have some choices to make. If you make the pointer dereference prefix, then you don't need extra parentheses when applying other prefix operators to it, e.g. -*p or !*p. If you make it postfix, then you don't need extra parentheses when applying other postfix operators to it, e.g. (using Pascal-style ^ for dereferencing) p^[0] or p^.x. Alternatively, you could add special postfix operators that desugar into the combination but avoid those extra parens, which is what C did with -> for field access.
(Technically, you could also make everything prefix instead, e.g. field access ALGOL 68 style: `month OF birthDate OF person`. But this is very counter-intuitive with indexing, and also makes code completion unusable, so it's not a serious option.)
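To make the precedence point concrete, here is a small C sketch (all names are invented for the example). Prefix operators stack on a prefix dereference without parentheses, postfix operators force parentheses around it, and spelling the dereference as postfix `[0]` removes the need for them again.

```c
#include <assert.h>

struct node { int value; };

/* Prefix operator applied to a prefix dereference: no parentheses needed. */
int negate(const int *p)  { return -*p; }
int is_zero(const int *p) { return !*p; }

/* Postfix operators bind tighter than prefix '*', so parentheses are required. */
int value_parens(struct node **pp) { return (*pp)->value; }
int third_parens(int (*pa)[3])     { return (*pa)[2]; }

/* Same accesses with the dereference written postfix: reads left to right. */
int value_postfix(struct node **pp) { return pp[0][0].value; }
int third_postfix(int (*pa)[3])     { return pa[0][2]; }

/* Hypothetical demo helper: both spellings give the same results. */
void demo_precedence(void) {
    int n = 5, zero = 0;
    int arr[3] = { 1, 2, 3 };
    struct node nd = { 9 };
    struct node *np = &nd;
    assert(negate(&n) == -5);
    assert(is_zero(&zero) == 1);
    assert(value_parens(&np) == 9 && value_postfix(&np) == 9);
    assert(third_parens(&arr) == 3 && third_postfix(&arr) == 3);
}
```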
In Python, there is no .await. But I can't remember ever seeing (await (await foo).bar), meaning it is very unlikely. It is usually written as:
bar = await foo()
await bar.coro(*args)
Some people hate that calling functions of different colors is syntactically distinct. On the contrary, I find it beneficial that suspension points stand out. Unlike code with preemptive threads that is much harder to reason about.
You will know because any text editor or IDE worth the name will light up that ".await" in such a way you will immediately know it's not a method call or a struct field. The entire construct, including the dot, is a postfix keyword.
This is not just a preference but a practical matter, esp. for reading and checking other people's code. Did you solve the puzzle at the end with high confidence?
You mean `foo = **bar[10]`? That’s equivalent to `bar[10][0][0]`, i.e. the array element access is done first, then the retrieved element is dereferenced twice. I’m quite confident in this, but I’m a systems programmer working in C++, so that’s my bread and butter.
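A quick way to check that reading is a sketch like the following, assuming `bar` is an array of pointers to pointers to `int` (the puzzle's actual types aren't shown here, so that is a guess). Indexing binds tighter than prefix `*`, so `**bar[10]` indexes first and then dereferences twice.

```c
#include <assert.h>

/* Prefix spelling: index first, then two dereferences. */
int read_prefix(int **bar[], int i)  { return **bar[i]; }

/* The same access written entirely with postfix indexing. */
int read_postfix(int **bar[], int i) { return bar[i][0][0]; }

/* Hypothetical demo helper: builds int -> int* -> slot 10 of the array. */
void demo_double_deref(void) {
    int v = 7;
    int *p = &v;
    int **bar[11] = { 0 };   /* all slots start out as null pointers */
    bar[10] = &p;
    assert(read_prefix(bar, 10) == 7);
    assert(read_postfix(bar, 10) == 7);
}
```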
This syntax is less arbitrary than C's. It draws a syntactic parallel between accessing a single member and accessing "all members". (by using pattern-matching-like syntax)
It makes the language more consistent and one's mental model of it smaller. (Even though I doubt that patterns other than the Kleene star would work)
I don't actually agree that it makes the mental model smaller. I see something that is just different from what I expect, is cognitively jarring, interrupts the flow of what I'm doing, and forces me to focus on something I shouldn't need to.
But it seems like I'm in the minority, so maybe it's just "old man shouts at clouds", I'll just ignore the language and move on :)
Yes, but unlike C's terribly confusing "declaration follows usage" style, which no one tells you about btw, Zig's pointer syntax doesn’t turn into a nightmarish puzzle.
In Zig, you can pretty much read the types aloud and it makes sense; your brain does not need to peek ahead to parse them.
For me it's too late, I got used to the C way but I still want the better thing to get adopted.
No more `int (*(*foo)(int))[5];` nonsense: just T, [N]T, or *T, making intent crystal clear.
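For reference, a declaration of that shape, `int (*(*foo)(int))[5]`, reads as "pointer to a function taking `int` and returning a pointer to an array of 5 `int`". Below is a rough C sketch of it in use next to a typedef'd equivalent; `table`, `get_table`, `int5` and `getter_fn` are names invented for the example.

```c
#include <assert.h>

static int table[5] = { 10, 20, 30, 40, 50 };

/* A function taking int and returning a pointer to an array of 5 int. */
int (*get_table(int unused))[5] {
    (void)unused;
    return &table;
}

/* The same types built up step by step with typedefs. */
typedef int int5[5];
typedef int5 *(*getter_fn)(int);

/* Hypothetical demo helper: both declarations name the same type. */
void demo_declarator(void) {
    int (*(*foo)(int))[5] = get_table;   /* raw declarator form */
    getter_fn bar = get_table;           /* typedef form        */
    assert((*foo(0))[2] == 30);
    assert((*bar(0))[4] == 50);
}
```

Even the use site needs parentheses, `(*foo(0))[2]`, because the call and the index are postfix while the dereference is prefix.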
Yeah, not sure where that came from. The ANSI C standard was extensively documented in books and articles and specs at all levels of rigor starting from the mid-80's. No one has ever lacked for a reference for how function pointer declaration looks.
That said, C's function pointer declaration syntax is indeed awful. But really that's because ANSI took a very hardline "no incompatible changes" tack when adding prototypes to the language, which limited the ways they could express them. That decision is one of the reasons we're still writing C today. Any yahoo can come up with a new language, kids do it all the time. ANSI's job was to add features to the language in which Unix was already written.
Sometimes they just don't want to be normal, that's it. And they created a language around an easier parser, not around great DX. Too much redundant "punctuation", etc.
I don't understand the snark. Is it not possible to write articles for beginners? There are also a lot of developers who have never dealt with memory allocation and are interested in Zig.
Yes, it is possible to write articles for beginners, but they should be on beginner concepts.
Writing about an advanced concept while catering to beginners is a mess. It's like writing about logarithmic calculus and prefacing it with an explanation of exponents.
Hey I started using Zig after a decade of JS/Elm/Haskell/PHP/Scala, such articles are very useful to me. Haven't used lower level languages in a long time.
Makes sense, it seems like you missed languages with pointers.
I would suggest going through C. Otherwise it's like learning C++ without learning C. Zig is similarly a successor to C, (it is phonetically named after it).
That would address your blind spots and let you appreciate Zig's identity
We might still be in a period where C can be both thought of as a historical didactical language and as a language to program in, I'm seeing the pushback from people that fear C as the second kind.
With time, we will stop using C except for very specific things (it's only used for embedded, Open Source and Operating systems at this time anyways), and we will be able to focus on C as a historical and didactical step in a learning path.
This is similar to why it's appropriate to joke about killing someone but it's not appropriate to make a joke about raping someone. Or we are ok with reading about the epic of gilgamesh, or the oddyssey or beowulf, but Bible readings might face more pushback.
But I think we can all agree that learning C is a basic step in the formation of any classically trained programmer.
P.S: Talk is good, I've definitely noticed a top-down approach becoming more popular than bottom-up. But I chose to start with C as a teen, and my uni started me up with Chem, physics and maths before going into programming. Definitely two separate paths.
In other courses I don't enjoy skipping history either, I like learning calculus by learning about newton, it makes it easier to remember. I hate the gray approach of just going for the solutions and memorizing them.
Learn C so you can avoid beginner content on Zig /s.
You're making many wrong assumptions at the same time:
- that I've never used C or C++ (both wrong)
- that I don't know pointers (also wrong, nor is it a particularly difficult topic)
- that you need to go through C again to learn Zig. Have you met many people that picked up Zig as their first programming (or first system programming) language to make those statements?
Because I am in both the IRC and discord and there's plenty of people that get proficient in Zig starting from it.
However, thinking long term, are we going to continue to introduce pointers via C to new students? If not, then how? C++ or Zig seem viable options, so this might be a long term proposition.
I definitely remember casting pointers to ints and printing them to prove to myself what it was pointing to and what the offset was for the next value e.g. in an array.
In C you can also do those fun array indexing and pointer arithmetic tricks that require you truly understand the concept of "address plus offset" that basically everything uses.
But even if that's a hallucination, Rust does the hard work of keeping pointers off your mind by a combo of refs, box, etc. and Rust is not a good first language, I don't think, for a variety of reasons.
My ultimate programming curriculum, if I had to make one, would teach you to program in Python, then show you C via Cython or similar. Lots of allocation and free under the hood in Python.
Casting pointers to ints is generally safe (at least as long as you use intptr_t instead of assuming that the size of pointers will never change).
The issue comes when you try casting to pointers. Because of providence, aliasing rules, and a couple of dragons that built a nest in the C language specification, you could have two pointers to the exact same memory locations, but have the program be undefined if you use the wrong pointer.
Granted, this doesn't stop you from doing things like
foo_t *foo = (foo_t*) 0xDEADBEEF
And in the few occasions where that is something you would reasonably want to do it does more or less what you would expect (unless you forgot that the CPU sticks a transparent cache between you and the memory bus; but not even assembly would save you from that oversight).
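The "cast the pointer to an integer and look at it" experiment mentioned above can be sketched in C roughly like this. It uses `uintptr_t` rather than `intptr_t` to keep the subtraction unsigned; the function names are invented for the example, and the exact integer values are implementation-defined, so treat the byte distance as what you would see on a typical flat-address-space platform rather than a guarantee.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Distance in bytes between two addresses, computed on their integer values. */
uintptr_t byte_distance(const void *lo, const void *hi) {
    return (uintptr_t)hi - (uintptr_t)lo;
}

/* Prints each element's address both as a pointer and as an integer. */
void print_addresses(int *arr, int count) {
    for (int i = 0; i < count; i++) {
        printf("arr[%d] at %p (integer value %" PRIuPTR ")\n",
               i, (void *)&arr[i], (uintptr_t)&arr[i]);
    }
}

/* Hypothetical demo helper: "address plus offset" in both views. */
void demo_address_offset(void) {
    int a[4] = { 1, 2, 3, 4 };
    /* Pointer arithmetic counts elements... */
    assert(&a[0] + 2 == &a[2]);
    assert(*(a + 3) == a[3]);
    /* ...while the integer view counts bytes (on typical platforms). */
    assert(byte_distance(&a[0], &a[1]) == sizeof(int));
    print_addresses(a, 4);
}
```

Going the other direction, from an integer back to a pointer, is where the provenance and aliasing concerns start to apply.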
Provenance (outside programming this is the distinction between "I reckon this old table is a few hundred years old" and "Here is the bill of sale from when my grandfather's ancestors had the table made from the old tree where that cherry tree is now in 1620") not Providence.
In Rust pointer provenance is a well defined language feature and if you go to the language docs you can read how it interacts with your use of raw pointers and the twin APIs provided for this.
In C the main ISO document just says basically here be dragons. There's an additional TS from 2023 with better explanation, but of course your C compiler even if it implemented all of C23 needn't necessarily implement that TS. Also of course the API is naturally nowhere near as rich as Rust's. C is not a language where pointers have an "addr" method so they also don't need a separate exposure API.
I suspect that in Zig none of this is clearly specified.
Sure, I'm interested in whether any of those bugs affect, say, Cranelift, because Cranelift did, as I understand it, a much better job of coming up with coherent IR semantics, so unlike with LLVM, fixing bugs in this layer isn't as scary if it's necessary for them.
It is definitely possible to write Rust or (with more difficulty) legal C which should show off something about provenance semantics, and instead the LLVM backend just emits contradictory "One and two are the same number" type nonsense code. In C of course they can say, well, since the ISO document pointedly does not specify how this works, maybe one and two really are the same number, although nobody actually wants that. In Rust that's definitely a bug, but you will just get pointed at the corresponding LLVM bug for this issue; they know it's busted but it's hard to fix.
I don't know whether fixing the LLVM bug magically makes it TS6010 compliant. If so that would be nice.
In systems languages that predated C like NEWP and PL/I variants, in Object Pascal, Modula-2, Mesa, BASIC, Ada.
The difference is that some of them distinguish between type-safe pointers, i.e. ones that can only be created by taking addresses of existing variables, and unsafe pointers, i.e. ones that can be created out of thin air like in C.
This is the way that Pascal does it, which was always more sensible than prefix notation. Pascal uses ^
That is very nice. If you treat the star as a postfix operator, then even "->" is no longer needed:
Historical reasons: early C compilers didn't actually keep track of what the type of the LHS was, so they needed `.` vs `->` to properly disambiguate. Later C compilers did start tracking that, so the difference in syntax was no longer necessary, but stuck for reasons of history and backward compatibility
https://retrocomputing.stackexchange.com/questions/10812/why...
Besides the sibling comment, I believe many of the finer details of Chomsky's hierarchy wasn't understood at the time, language grammars were just sort of "home-grown".
But I might be wrong here.
You should probably just cast to an intermediary `struct foo *`. It'd be a lot more readable.
In a way, Zig is Modula-2 with curly brackets for folks with C background.
Yes, I know, comptime and such.
Nice explanation of zig's dereferencing operator. Thanks!
Article says it’s about stack addresses but I think it should apply more generally to dynamically allocated memory as well.
Tl;dr: It is just pointer dereferencing.
I.e. C's
is in Zig.
Yeah I found this article super weird. Explaining pointers & dereferencing is reasonable, but doing it in the context of Zig specifically, like Zig is the first language to feature dereferencing, is odd. Especially since pointers are such a fundamental part of low-level (really, any) programming. Also not sure why it's posted here?
Edit: Going through this author's website, it seems like a lot of their posts are about rediscovering low-level programming concepts through Zig. Like this article, where they discover you can't compare strings directly, and you have to use memcmp:
https://www.openmymind.net/Switching-On-Strings-In-Zig/
They claim that they blog because they "find that [they] retain things better when I write about them." No problem with that. Just a little odd to see on the hn front page, I suppose.
> rediscovering
I think this is just "discovering". While you (and I) probably discovered these things with assembly or C languages, it's perfectly reasonable or even appropriate for newer generations to have these kinds of experiences with Zig or Rust.
Absolutely, like I said, there's no problem with what this person is doing or the way that they're exploring computers, I'm just confused as to why it's being posted here.
> I'm just confused as to why it's being posted here.
It's more topical than many articles posted here imo. But topical is in the eye of the beholder.
My understanding is that the only qualifier for something on hackernews is that it has to be "deeply interesting":
https://news.ycombinator.com/newswelcome.html
Deeply interesting tech, or otherwise.
Maybe the place is not reserved to old grey beard? :)
I honestly don't understand his displeasure at the article being posted here. It reeks of elitism to me.
Maybe I'm wrong and this is new to a lot of people! I have a limited perspective based on my programming journey, which winds mostly through gamedev, graphics programming, and DSP, all (typically) low level domains. But I think if the title of the article were more accurate (e.g. "What are pointers?"), my reaction would be more clear. I'm also kinda taken aback by the "old grey beard" comment. Look at all the kids using Rust, Zig, even C and tell me that this is obscure knowledge.
When I took my first programming class at RIT, visualizing the stack, variables, and pointers was one of the first classes we had. It's one of those beginner diagrams that I feel like everyone is familiar with. But I can understand that there are programming domains with equivalent complexity which don't require that base knowledge. I apologize if I came off as elitist.
Is it?
When learning piano you first learn how to play rudiments and then you move up to more complex scores.
Otherwise you end up writing an article about how Ravel's Scarbo is amazing because it involves playing with two hands at the same time.
https://youtu.be/2BT7_owW2sU?si=VNpy3K6UXSkAsn7u
There are amateur and outsider musicians for sure, I've seen them. They learn by ear, usually listen to pop or rock, and are not classically trained.
More common with guitar than piano though.
Similarly, they are more common with JavaScript than with C.
not when you can let an ai play the notes for you
Rediscovering is a perfectly fine word for one person discovering for the first time what others have discovered before.
Yeah, I kept on reading to find out if there is some Zig-specific twist but it is literally just that.
Are there really that many Zig programmers that have never seen C or know what pointers are?
Author here. I agree this is a less-than-captivating piece. I write a lot about Zig and wanted something I could reference from other pieces.
But, to answer your question directly: absolutely. In addition to writing a lot about it, I maintain some popular libraries and lurk in various communities. Let me assure you, beginner memory-related questions come up _all the time_. I'd break them down into three groups:
1 - Young developers who might have a bit of experience in JavaScript or python. Not sure how they're finding their way to Zig. Maybe from HN, maybe to do game development. I think some come to Zig specifically to learn this kind of stuff (I've always believed most programmers should know C. Learning Zig gets you the same fundamentals, and has a lot of QoL stuff).
2 - Hobbyists. Often Python developers, often doing embedded stuff. Might be looking to write extensions in Zig (versus having to do it in C).
3 - Old programmers who have been using higher level languages for _decades_ and need a refresher. Hey, that's me!
Hey, that's me too!
Also, from learning human languages, it's a well-known lesson that phrasebook-type "this means this" translations (like some here are asking, from Zig to C/Rust) are useful for quick and dirty learning good enough for one trip, but long term learning needs this kind of a direct explanation.
1. It avoids the word (or syntax in this case) getting stuck in a double-indirection state, needing you to mentally translate it from Zig to C to what it actually means every time.
2. It avoids the learner attaching the wrong nuances to the word or syntax feature, based on the translation they're given, when the language they're learning has different nuances. In other words, it helps the learner see it as its own thing, and not be unduly colored by what they already know and find easy to grasp on to (even when it's subtly wrong).
> Also, from learning human languages, it's a well-known lesson that phrasebook-type "this means this" translations (like some here are asking, from Zig to C/Rust) are useful for quick and dirty learning good enough for one trip, but long term learning needs this kind of a direct explanation.
This is also good when the user already knows the concept. Like, I'm a reasonably competent Rust user, and when I started playing around with Zig I already understood the majority of concepts at play and just needed to know how Zig spelled them. But I still needed that more in-depth explanation for concepts I was less familiar with (comptime being the main one).
> Not sure how they're finding their way to Zig.
Lots of popular Youtubers such as Primeagen (somebody who easily gets 200/300k+ views per video) have been speaking highly about Zig.
Perhaps add a brief paragraph, for C folks, that draws parallels between the Zig code and C. You never know, other readers may find it useful as well.
The author may not know C. As OP said, it's definitely the case that people interested in Zig may not have any interest in going through C first. They may eventually have to, as Zig interfaces heavily with C code and libs, but there's nothing wrong with going Zig first.
I really like your articles. Zig has some pitfalls and the official documentation can be a bit sparse.
It feels a bit performative to me: the article goes out of its way not to explain it by giving the C equivalent.
Perhaps some day Zig will have replaced C and beginners will come to Zig having never touched C, and in that context this approach makes sense - after all, you wouldn't litter an introductory article on C with comparisons to Algol. But today, surely the modal Zig beginner already knows enough C that the syntax would better be explained by reference to C.
I know some C but I haven't touched it in a decade, but I have started playing with Zig so this was useful to me.
You aren't always the target.
And why change something like that? The world would be a better place, IMHO, if there were fewer different ways to write something from language to language.
Sometimes it seems like the change is just to make it different, not better.
Same reason Rust uses foo.await instead of await foo. It's clearly superior syntax.
The whole point of Zig is to fix C's mistakes. I don't know why they'd repeat this one.
I think Rust could use some syntax improvements for pointer manipulation. It has suffix .await, so maybe the community is ready for suffix .* for dereference now too. Visually scanning left and right, along with the extra parentheses, makes pointer-based code in Rust worse than C++ almost. A fully left-to-right syntax would be amazing. EDIT: found https://github.com/rust-lang/rust/issues/10011 and https://github.com/rust-lang/rfcs/pull/3577
I’ve gotta say, when I first learned Rust had gone its own way on await, I was heavily sceptical. But seeing the actual examples was pretty compelling.
Yeah I was skeptical too but it makes so much more sense. Now I'm constantly thinking "this is dumb" when using other languages with the "normal" syntax.
I wish they'd got it right for (de)referencing too though.
At this point &var and *var are pretty much an established way across languages, like if-else. I don’t see any benefit of reinventing this paradigm.
You mean like if-else if, if-elseif, if-elsif, if-elif? I think I should use eif for my next language. You can have elsf and elf.
It’s not that important though. Many people write two or more languages without mixing the syntax.
p.* and lv.& would be good post-operators, imo. Less going left and parenthesizing.
But those are all variations of if-else, right?
The alternative to if-else would be something like `assuming-otherwise`.
The point was that there wasn’t really a pressing need to reinvent the *p/&p syntax.
The fact that the author of Zig who's extremely well versed in C and C++ went this way should make you at least try to think why he went that way before dismissing it.
Sounds like Appeal to Authority fallacy.
> Copying bad design is not good design.
—Andreas Rumpf
Agreed. And p.* is remarkably unintuitive and nonsensical compared to *p.
It's remarkably intuitive and sensible if you remember that . in Zig auto-dereferences for field access and then treat `*` in this construct as field name denoting the entire object.
Ada does the same exact thing, except there you write `p.all` instead.
In any case, while the exact syntax may not be ideal, using a postfix operator for dereferencing is vastly better than prefix in practice due to typical patterns of use. There's a reason why you end up writing () a lot in C code with heavy pointer operations - the things they end up mixed with most often turn out to have the wrong kind of priority and/or precedence more often than not. Things are much simpler when everything is postfix and code reads naturally left-to-right.
It's not at all unintuitive or nonsensical. You're just really used to *p because it's what C uses.
Postfix is only superior when you have lots of things to chain. C doesn't have any chaining at all. f(g(x)) is just as good as f(g(x))
One is able to chain calls in C via:
At least one person at Bell Labs historically used that scheme for some graphics program.
All it requires is that the function returns a pointer to a struct which itself contains function pointers.
C also allows the dot form if you really want it:
Simply by returning structs, and in all compilers that I know of, this ends up using hidden pointers to structs.
Now to make it a bit more usable, one needs a bit of planning so that either of these can be done:
(I'd suggest the latter is now the more readable of the two).
Where there is a VFT in each of the xxx_out structures, and the calls simply return the VFT, while the whole abstraction is stored/returned via the out arguments.
Gotta escape those asterisks so you don't output italics.
f(*g(*x)) is just as good as f(g(x*)*)
Hmm, function names go before parentheses, that looks like a good reason for dereferencing to be prefix too, otherwise they end up a long way apart.
But in the back of my mind I'm thinking "semantics cause pointless fussy arguments, all code is ugly, let's stop programming in text somehow".
Oh yeah, forgot HN supported some markdown.
But this is literally not semantics. Semantics is everything of value, this is just syntax.
Sorry, yes, syntax. I was equating it to a "purely semantic argument", arguing about the meaning of words. Here it's about where to put symbols.
why is foo.await better?
is it a method call? is it a property/field value? how am i to be indicated that there is an implicit suspend point when you .await?
It's the difference between (await (await foo).bar).baz and foo.await.bar.await.baz -- which do you want to write?
In general, the rule of thumb is that prefix and postfix operators don't mix well in a single expression - order of operations is confusing, and reading the code requires going back and forth to follow it.
In the ideal world, we wouldn't have unary prefix operators at all, but unfortunately unary +, -, and NOT are prefix mostly because they were that in math notation and got grandfathered in (bonus points to Smalltalk for going with consistency here - "not" and "negated" are regular nullary methods there so it's postfix throughout!). However, these are rarely themselves an operand of another unary operator, so you can mostly deal with this by giving postfix higher precedence than prefix in all cases, so at least it's a simple rule. But then for pointer dereference, it is in fact common to have the result of a dereference itself be an operand in the middle of another expression.
So now you have some choices to make. If you make the pointer dereference prefix, then you don't need extra parentheses when applying other prefix operators to it, e.g. -*p or !*p. If you make it postfix, then you don't need extra parentheses when applying other postfix operators to it, e.g. (using Pascal-style ^ for dereferencing) p^[0] or p^.x. Alternatively, you could add special postfix operators that desugar into the combination but avoid those extra parens, which is what C did with -> for field access.
(Technically, you could also make everything prefix instead, e.g. field access ALGOL 68 style: `month OF birthDate OF person`. But this is very counter-intuitive with indexing, and also makes code completion unusable, so it's not a serious option.)
In Python, there is no .await. But I can't remember ever seeing (await (await foo).bar), meaning it is very unlikely. It is usually written as:
Some people hate that calling functions of different colors is syntactically distinct. On the contrary, I find it beneficial that suspension points stand out. That is unlike code with preemptive threads, which is much harder to reason about.
Is this something you write often in Rust? A promise returning a struct with a promise returning a struct with a promise.
Although I agree that await could use some precedence like new in javascript:
I’d rather parenthesize what I’m awaiting in complex cases than awaiting just everything that follows.
> which do you want to write?
Neither, honestly. They both look terrible.
It preserves a left to right reading of the code. You don't have to jump visually elsewhere.
await is a keyword so it can't be a field. Same way you know if(foo) isn't a function call.
You will know because any text editor or IDE worth the name will light up that ".await" in such a way you will immediately know it's not a method call or a struct field. The entire construct, including the dot, is a postfix keyword.
oh no. this is DEFINITELY a place where zig made a GOOD change away from c.
when a value is dereferenced, having a consistent left to right dereferencing makes sense.
for example, in c, without looking up the order of operations how confident are you that you know what is going on in the following:
foo = **bar[10]
If prefix * bugs you, C has a suffix dereference operator as well, `[0]`.
(Yes, C programmers will probably hunt you down if you do this, but it does work.)
this is not just a preference but a practical matter, esp. for reading and checking other people's code. did you solve the puzzle at the end with high confidence?
You mean `foo = **bar[10]`? That’s equivalent to `bar[10][0][0]`, i.e. the array element access is done first, then the retrieved element is dereferenced twice. I’m quite confident in this, but I’m a systems programmer working in C++, so that’s my bread and butter.
it's hard to disambiguate from bar[0][0][10]...
Sure, I see your point and suffix does seem like the better option. But for people who do a lot of C, it’s not an issue.
It's not just different.
This syntax is less arbitrary than C's. It draws a syntactic parallel between accessing a single member and accessing "all members". (by using pattern-matching-like syntax)
It makes the language more consistent and one's mental model of it smaller. (Even though I doubt that patterns other than the Kleene star would work)
A parallel with files:
In Zig you can
In C:
I don't actually agree that it makes the mental model smaller. I see something that is just different from what I expect, is cognitively jarring, interrupts the flow of what I'm doing, and forces me to focus on something I shouldn't need to.
But it seems like I'm in the minority, so maybe it's just "old man shouts at clouds", I'll just ignore the language and move on :)
Yes, but unlike C's terribly confusing "declaration follows usage" style, which no one tells you about btw, Zig's pointer syntax doesn't turn into a nightmarish puzzle.
In Zig, you can pretty much read the types aloud and they make sense; your brain does not need to jump back and forth to parse them. For me it's too late, I got used to the C way, but I still want the better thing to get adopted. No more `int (*(*foo)(int))[5];` nonsense: just T, [N]T, or *T, making intent crystal clear.
> Yes, but unlike C's terribly confusing "declaration follows usage" style, which no one tells you about btw
I happen to be reading the K&R C book right now and they do in fact tell you this.
Well it's been lost on every C online tutorial I guess
I have not read it so that's on me I guess
Yeah, not sure where that came from. The ANSI C standard was extensively documented in books and articles and specs at all levels of rigor starting from the mid-80's. No one has ever lacked for a reference for how function pointer declaration looks.
That said, C's function pointer declaration syntax is indeed awful. But really that's because ANSI took a very hardline "no incompatible changes" tack when adding prototypes to the language, which limited the ways they could express them. That decision is one of the reasons we're still writing C today. Any yahoo can come up with a new language, kids do it all the time. ANSI's job was to add features to the language in which Unix was already written.
I wonder why they went that route. At this point in my career the first notation is ingrained in my head. The bottom looks like regex.
So.. is a double dereference..
Yes
FTA: “power.* is how we dereference a pointer“
This is basically the gist of the article.
I’m surprised to see that so many programmers these days don’t know C and basic pointers.
C is not even in the top dozen most popular languages https://www.jetbrains.com/lp/devecosystem-data-playground/
Treating everything a popularity contest is definitely not the best way, right?
Why would you?
Large parts of the industry don't really work with low(er) level languages.
Even people who went through C/C++ in their formal education/start of their career may have not used it for a long time.
Learning C is a time-honored tradition. I’m surprised that I have to defend the C language on a forum like HN.
Everyone who knows pointers had to learn them at some point.
Are you honestly surprised?
Good compiler, poor syntax. I'm waiting for the TypeScript equivalent for Zig.
Do you find it poor just because it doesn't align with your experience and bias you hold from writing other languages?
Sometimes they just don't want to be normal, that's it. And they created a language around an easier parser, not around great DX. Too much redundant "punctuation", etc.
Have you looked at C3? Its goals align with Zig but it’s much closer to C in terms of syntax.
Probably a nitpick but
Didn't read, got bored at having pointers explained, even if actively trying to skip it.
I get a mix of: 1- you don't know your audience or are trying to cater to too wide of an audience
2- if someone doesn't know how pointers work, is there any merit in knowing syntax of a novel programming language?
So if you want the veterans to read this, you are going to have to make it less accessible.
Edit: it seems like the whole article is OP discovering pointers but zig is like his first language. Lol.
I don't understand the snark. Is it not possible to write articles for beginners? There are also a lot of developers who have never dealt with memory allocation and are interested in Zig.
Yes, it is possible to write articles for beginners, but they should be on beginner concepts.
Writing about an advanced concept while catering to beginners is a mess. It's like writing about logarithmic calculus and prefacing it with an explanation of exponents.
Hey, I started using Zig after a decade of JS/Elm/Haskell/PHP/Scala, and such articles are very useful to me. Haven't used lower-level languages in a long time.
You aren't always the audience, it's fine.
This is the HN caveat.
Makes sense, it seems like you missed languages with pointers.
I would suggest going through C. Otherwise it's like learning C++ without learning C. Zig is similarly a successor to C, (it is phonetically named after it).
That would address your blind spots and let you appreciate Zig's identity
Since at least C++14, no one should be learning C as a means to learn C++, unless they want to keep writing security exploits on a regular basis.
CppCon 2015: "Stop Teaching C"
https://www.youtube.com/watch?v=YnWhqhNdYyk
We might still be in a period where C can be both thought of as a historical didactical language and as a language to program in, I'm seeing the pushback from people that fear C as the second kind.
With time, we will stop using C except for very specific things (it's only used for embedded, Open Source and Operating systems at this time anyways), and we will be able to focus on C as a historical and didactical step in a learning path.
This is similar to why it's appropriate to joke about killing someone but it's not appropriate to make a joke about raping someone. Or we are ok with reading about the Epic of Gilgamesh, the Odyssey, or Beowulf, but Bible readings might face more pushback.
But I think we can all agree that learning C is a basic step in the formation of any classically trained programmer.
P.S: Talk is good, I've definitely noticed a top-down approach becoming more popular than bottom-up. But I chose to start with C as a teen, and my uni started me up with Chem, physics and maths before going into programming. Definitely two separate paths.
In other courses I don't enjoy skipping history either, I like learning calculus by learning about newton, it makes it easier to remember. I hate the gray approach of just going for the solutions and memorizing them.
Learn C so you can avoid beginner content on Zig /s.
You're making many wrong assumptions at the same time:
- that I've never used C or C++ (both wrong)
- that I don't know pointers (also wrong, nor is it a particularly difficult topic)
- that you need to go through C again to learn Zig. Have you met many people that picked up Zig as their first programming (or first system programming) language to make those statements?
Because I am in both the IRC and discord and there's plenty of people that get proficient in Zig starting from it.
That's the impression I got too.
However, thinking long term, are we going to continue to introduce pointers via C to new students? If not, then how? C++ or Zig seem viable options, so this might be a long term proposition.
Inb4 there won't be programmers
It's an interesting question.
I first don't see a problem with C being a basal part of the curriculum for centuries, we are already at half a century going strong.
Second, if this were to happen, it should happen with a mature language, and decided consciously by seasoned professionals.
Third, I think more than looking at other languages we should look into other types of pointers, like files, hyperlinks, assembly addresses, inodes.
To date C is the standard for pointer-pointer syntax, introducing new standards is fine, but it only makes sense with the standard syntax.
It's the canon of programming. Long live C
Rust would be an option as well
Rust does not in fact have a good way to introduce raw pointers that can be passed around, and can also be treated as addresses.
yes, it does https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html
On top of those fundamentals, there have been more recent developments:
https://doc.rust-lang.org/std/ptr/index.html#strict-provenan...
https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
https://www.ralfj.de/blog/2020/12/14/provenance.html
https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html
Sorry, in what language can you treat pointers as just addresses? Sure, assembly, but beyond that. Maybe Pascal?
Everything else has provenance rules and magic unrepresented metadata around it that makes a pointer more than just an address.
I definitely remember casting pointers to ints and printing them to prove to myself what it was pointing to and what the offset was for the next value e.g. in an array.
In C you can also do those fun array indexing and pointer arithmetic tricks that require you truly understand the concept of "address plus offset" that basically everything uses.
But even if that's a hallucination, Rust does the hard work of keeping pointers off your mind by a combo of refs, box, etc. and Rust is not a good first language, I don't think, for a variety of reasons.
My ultimate programming curriculum, if I had to make one, would teach you to program in Python, then show you C via Cython or similar. Lots of allocation and free under the hood in Python.
Casting pointers to ints is generally safe (at least as long as you use intptr_t instead of assuming that the size of pointers will never change).
The issue comes when you try casting to pointers. Because of providence, aliasing rules, and a couple of dragons that built a nest in the C language specification, you could have two pointers to the exact same memory locations, but have the program be undefined if you use the wrong pointer.
Granted, this doesn't stop you from doing things like
And in the few occasions where that is something you would reasonably want to do it does more or less what you would expect (unless you forgot that the CPU sticks a transparent cache between you and the memory bus; but not even assembly would save you from that oversight).
Provenance (outside programming this is the distinction between "I reckon this old table is a few hundred years old" and "Here is the bill of sale from when my grandfather's ancestors had the table made from the old tree where that cherry tree is now in 1620") not Providence.
In Rust pointer provenance is a well defined language feature and if you go to the language docs you can read how it interacts with your use of raw pointers and the twin APIs provided for this.
In C the main ISO document just says basically here be dragons. There's an additional TS from 2023 with better explanation, but of course your C compiler even if it implemented all of C23 needn't necessarily implement that TS. Also of course the API is naturally nowhere near as rich as Rust's. C is not a language where pointers have an "addr" method so they also don't need a separate exposure API.
I suspect that in Zig none of this is clearly specified.
LLVM has optimizations that are not consistent with TS 6010 or any consistent model, and this should affect Rust as well.
Sure, I'm interested in whether any of those bugs affect, say, Cranelift, because Cranelift did, as I understand it, a much better job of coming up with coherent IR semantics, so unlike LLVM, fixing bugs in this layer isn't as scary if it's necessary for them.
It is definitely possible to write Rust or (with more difficulty, legal C) which should show off something about provenance semantics and instead the LLVM backend just emits contradictory "One and two are the same number" type nonsense code. In C of course they can say well, since the ISO document pointedly does not specify how this works, maybe one and two really are the same number, although nobody actually wants that - in Rust that's definitely a bug but you will just get pointed at the corresponding LLVM bug for this issue, they know it's busted but it's hard to fix.
I don't know whether fixing the LLVM bug magically makes it TS 6010 compliant. If so that would be nice.
In systems languages that predated C like NEWP and PL/I variants, in Object Pascal, Modula-2, Mesa, BASIC, Ada.
The difference is that some of them distinguish between type-safe pointers, i.e. ones that can only be created by taking addresses of existing variables, and unsafe pointers, i.e. ones that can be created out of thin air like in C.
Type (and word length as a derivate of it) anything else?
Doesn't seem hidden to me
Not at all, rust is more a successor to C than a replacement.