About that char datatype..
So, I was helping HanaSch earlier today with some code for CS50x, module 4, which is basically about walking through a file and learning about old-school C-style file pointers and suchlike. I was roughly of the mind that the JFIF header detection code provided with the module was wonky.. turns out it’s not, but that’s not the whole answer.
File digestion in C
For a standard file read in, spit out kind of operation this is sort of what I’ve done for aaages now. It’s been a while since I had to do file ops in C (don’t find much use for files on an AVR…) but this is basically muscle memory these days:
and compiled and invoked thusly on my little WSL/Debian setup:
The error crystallises..
The ‘do real work stuff here’ placeholder should’ve had some JFIF header detection, which relies on the headers and data being aligned on 512-byte boundaries (true for the sample file being used) and just naively looks for 0xFF 0xD8 0xFF (the SOI marker followed by the first byte of the next marker; see the JPEG article on Wikipedia). There’s nothing wrong with this, and by all damn rights it should work… but it didn’t, so for step 1 I cracked open the file in a hex editor and stripped the problem down to bare basics. Here’s the output:
Wait. What? What the actual flip? Why and when did characters get to be wha?
The first thing I did was to apply an ugly 0x000000ff mask to get it working, but that felt (and is) so very, very wrong that I wanted to know what I was screwing up. So I did some digging, based on a gut feel that something was wrong with the fread call, which takes us to IEEE Std 1003.1-2017, which has an example with the same basic pattern.
So. At this point, being thoroughly confused I took a brief break.. and then it hit me.
Unsigned. FML. It’s an unsigned char I need.
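It turns out that for the longest time I’ve been assuming that char == unsigned char, and at some point that assumption broke. Whether plain char is signed is implementation-defined in C, and on this setup it is signed, so a byte like 0xFF reads back as -1 and sign-extends to 0xffffffff when it gets promoted to int. Allow me to demonstrate: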
So yes. Use an unsigned char when you want one.