Now you know enough to make your first real, useful program: a command-line PNG file tool.

PNG is a very common image file format these days. PNG files are split into several chunks. You will make a tool that:

• can read the PNG file format and show all the chunks
• can extract basic information about the file and display it
• can extract textual information from the file and display it
• (and for extra credit) can add new textual information to the file.

## The PNG file format

Many binary file formats, including PNG, are chunked. The file is split into several pieces, and each piece has an identifier and a length.

A very important point:

PNG is a BIG-ENDIAN file format. Thoth, like most computers today, is a little-endian machine. This means that any integer bigger than a byte that you read in must be byte-swapped. This is simple to do and is explained in the project details.
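To make the byte-swapping concrete, here is a minimal sketch of what a 32-bit swap does. The name swap32_sketch is made up for illustration; the starting code supplies its own bswap32, and this is not necessarily how that one is written.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only.
   Reverses the order of the 4 bytes in a 32-bit value. */
uint32_t swap32_sketch(uint32_t v) {
    return ((v & 0x000000FFu) << 24) |   /* lowest byte moves to the top    */
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);    /* highest byte moves to the bottom */
}
```

For example, the number 13 is stored in a PNG file as the bytes 00 00 00 0D. Read straight into an integer on a little-endian machine, those bytes come out as 0x0D000000; swapping gives back 13.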

### The overall format

Any PNG file will look like this:

The file signature is a sequence of 8 bytes that uniquely identifies this as a PNG file. It is the following array of bytes (NOT a zero-terminated string!):

137, 'P', 'N', 'G', '\r', '\n', 26, '\n'


After the file signature comes a sequence of chunks. The IHDR chunk is always the first, and the IEND chunk is always the last.

### Chunk format

Every chunk (including IHDR and IEND) has this same layout:

The first 4 bytes are the length, as an unsigned BIG-ENDIAN!!!!!!!! integer.

The next 4 bytes are the identifier; this is a sequence of 4 ASCII characters, but not zero-terminated. This is something like IHDR.

Then there are length bytes of data. length can be 0, in which case there are… no data bytes!

Finally, the last 4 bytes are the CRC. We’ll ignore this, but you’ll have to skip over it when you read the file. (The CRC is used to detect file corruption.)
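As a rough sketch of that layout, the code below pulls the length out of a chunk that is already sitting in a byte array. It builds the integer with shifts, which is an alternative to reading and then byte-swapping; the function names are assumptions for illustration and are not part of the starting code.

```c
#include <stdint.h>
#include <stddef.h>

/* Assemble a big-endian 32-bit value from 4 bytes.
   Works the same on any host, because it never reinterprets memory. */
uint32_t read_be32_sketch(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Total bytes a chunk occupies in the file:
   4 (length) + 4 (identifier) + length (data) + 4 (CRC). */
size_t chunk_total_size_sketch(const unsigned char *chunk) {
    return 4 + 4 + (size_t)read_be32_sketch(chunk) + 4;
}
```

An IEND chunk is the bytes 00 00 00 00 49 45 4E 44 AE 42 60 82: a length of 0, the identifier IEND, no data, and then the CRC, for 12 bytes in total.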

Here is an example chunk, showing the bytes in the order they appear in the file in hexadecimal:

## Your program and the starting point

Right click this link and download to get the starting code. All I’ve done is implement the boring argument parsing and given you some helpful utility functions.

Also here are some test PNG files:

Read the comments on the little tiny functions at the top of the file! They’re useful!

Compile like so:

gcc -Wall -Werror --std=c99 -g -o readpng readpng.c


You will be able to run it in four different ways:

./readpng
./readpng file
./readpng file dump
./readpng file text


If you try those right now, they’ll say they’re unimplemented and exit. You have to implement the command functions: show_info, dump_chunks, and show_text.

## 1. Open the file and check the file signature!

From now on, whenever I tell you to do something, make a function for it. You will be graded on coding style. This is not CS 0007. You cannot put everything in main. Don’t try it.

Make a new function. This function needs to take the filename and return a FILE*. It should:

• open the file for reading binary data
• check if opening the file failed, and if so, print an error message and exit
• use exit(1) to exit the program.
• read the PNG file signature into an array of 8 characters
• use strneq to compare it to the PNG_SIGNATURE I gave you
• and if they’re not equal, give an error and exit.
• return the opened FILE*. (don’t fclose it… yet.)
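The steps above could be sketched roughly like this. Everything named with _SKETCH or _sketch is a stand-in invented for illustration: the starting code has its own PNG_SIGNATURE and strneq, and since their exact definitions are not shown here, this version uses a local array and memcmp instead.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the PNG_SIGNATURE constant in the starting code. */
static const unsigned char SIG_SKETCH[8] = {137, 'P', 'N', 'G', '\r', '\n', 26, '\n'};

/* Hypothetical function name. Opens the file, checks the signature, and
   returns the FILE* positioned just past the 8 signature bytes.
   Prints a message and exits with status 1 on any failure. */
FILE *open_png_sketch(const char *filename) {
    FILE *f = fopen(filename, "rb");          /* "rb" = read binary */
    if (f == NULL) {
        fprintf(stderr, "error: could not open '%s'\n", filename);
        exit(1);
    }

    unsigned char sig[8];
    /* A short read (file under 8 bytes) also counts as "not a PNG". */
    if (fread(sig, 1, sizeof sig, f) != sizeof sig ||
        memcmp(sig, SIG_SKETCH, sizeof sig) != 0) {
        fprintf(stderr, "error: '%s' is not a valid PNG file\n", filename);
        fclose(f);
        exit(1);
    }
    return f;                                  /* caller closes it later */
}
```

memcmp is used here rather than strcmp because the signature is not a zero-terminated string.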

Then, call that function in show_info using the filename argument that was given.

Now compile and test it!!

For example:

• ./readpng readpng.c should say that it is not a valid PNG file.
• ./readpng cookiebear.png should say nothing, because it is valid.

## 2. Showing the file info (show_info)

Make a function to read a chunk’s header (the length and identifier fields). Tips:

• make a struct to represent a chunk header.
• use unsigned int for the length field.
• the identifier field is an array of 4 characters. no zero terminator.
• have the function fread that header, and then byte-swap the length field.
• you can byte-swap a value with the bswap32 I gave you: value = bswap32(value);
• have the function return the chunk header struct.
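Put together, those tips might look something like the sketch below. The struct and function names are hypothetical, and bswap32_sketch is only a stand-in for the bswap32 in the starting code. It assumes a little-endian machine with a 32-bit unsigned int, as described earlier.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned int length;   /* in host byte order after reading */
    char type[4];          /* e.g. IHDR; NOT zero-terminated   */
} ChunkHeaderSketch;

/* Stand-in for the bswap32 supplied by the starting code. */
static unsigned int bswap32_sketch(unsigned int v) {
    return (v << 24) | ((v & 0xFF00u) << 8) | ((v >> 8) & 0xFF00u) | (v >> 24);
}

/* Reads the 8-byte chunk header at the current file position.
   Exits with status 1 if the file ends early. */
ChunkHeaderSketch read_chunk_header_sketch(FILE *f) {
    ChunkHeaderSketch h;
    if (fread(&h, sizeof h, 1, f) != 1) {
        fprintf(stderr, "error: unexpected end of file in chunk header\n");
        exit(1);
    }
    h.length = bswap32_sketch(h.length);   /* big-endian file, little-endian host */
    return h;
}
```

Using sizeof is fine for this particular struct because its two fields add up to 8 bytes with no padding on typical compilers. Printing the identifier then looks like printf("%.4s", h.type), since there is no zero terminator to stop at.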

The first chunk in the file (right after the file signature) should be the IHDR chunk. Since you just fread’ed the file signature, the file position is already in the right place to start reading the chunk, so no fseek is necessary.

In show_info, after opening the file, use that function you just made to read the first chunk.

BEFORE YOU CONTINUE, test that this code works:

• print the chunk length; it should be 13.
• print the chunk identifier using this printf specifier: %.4s; it should say IHDR.

Once you’ve verified that’s working, you can read the actual file info. The 13 data bytes that follow the chunk’s header are:

• (4 bytes) width (unsigned)
• (4 bytes) height (unsigned)
• (1 byte) color bit depth
• (1 byte) color type
• (1 byte) compression (must be 0)
• (1 byte) filtering (must be 0)
• (1 byte) interlaced (0 or 1)

Make a struct for this and make a function to fread a copy of that struct. Don’t forget to byte-swap the width and height after reading!

Think back to the sizeof() exploration you did in lab 2. Use that knowledge to pick appropriate types for each field. Remember, char is not only for text.

Even though you only put 13 bytes of fields in this struct, it will end up as 16 bytes. (Why?) Because of that, when you use fread() to read an instance of this struct from the file, use a constant 13, not sizeof(). Otherwise, your file position will get out of place.
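Here is one possible shape for that struct and its reading function, as a sketch only. All names are made up for illustration, and bswap32_sketch again stands in for the bswap32 in the starting code. On typical compilers the struct is padded from 13 up to 16 bytes so that its size is a multiple of the 4-byte alignment of its unsigned int fields.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    unsigned int  width;        /* 4 bytes, byte-swapped after reading */
    unsigned int  height;       /* 4 bytes, byte-swapped after reading */
    unsigned char bit_depth;    /* 1 byte each from here on            */
    unsigned char color_type;
    unsigned char compression;
    unsigned char filter;
    unsigned char interlaced;
} HeaderInfoSketch;

enum { IHDR_DATA_SIZE_SKETCH = 13 };   /* bytes in the file, NOT sizeof */

/* Stand-in for the bswap32 supplied by the starting code. */
static unsigned int bswap32_sketch(unsigned int v) {
    return (v << 24) | ((v & 0xFF00u) << 8) | ((v >> 8) & 0xFF00u) | (v >> 24);
}

/* Reads exactly 13 bytes of IHDR data at the current file position. */
HeaderInfoSketch read_header_info_sketch(FILE *f) {
    HeaderInfoSketch info = {0};
    if (fread(&info, IHDR_DATA_SIZE_SKETCH, 1, f) != 1) {
        fprintf(stderr, "error: unexpected end of file in IHDR\n");
        exit(1);
    }
    info.width  = bswap32_sketch(info.width);
    info.height = bswap32_sketch(info.height);
    return info;
}
```

Reading sizeof(HeaderInfoSketch) bytes instead would pull 3 bytes of the CRC into the padding and leave the file position 3 bytes too far along.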

Now you can print out the info. Notes:

• The width and height can be printed using the %u specifier.
• The bit depth can be printed with %d.
• even though it’s 1 byte, we’re treating it as an integer, not a text character!
• The color type can be any of the following values, which mean:
• 0: Grayscale
• 2: RGB
• 3: Indexed
• 4: Grayscale + Alpha
• 6: RGB + Alpha
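One simple way to turn the color type number into text is a switch, sketched below. The function name is hypothetical, and the "Unknown" fallback is an addition for values that are not in the list above.

```c
/* Hypothetical helper: maps the IHDR color type byte to a display name. */
const char *color_type_name_sketch(unsigned char color_type) {
    switch (color_type) {
        case 0:  return "Grayscale";
        case 2:  return "RGB";
        case 3:  return "Indexed";
        case 4:  return "Grayscale + Alpha";
        case 6:  return "RGB + Alpha";
        default: return "Unknown";   /* not one of the listed values */
    }
}
```

The result can go straight into printf with %s, for example printf("Color type: %s\n", color_type_name_sketch(6)).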

When you run it as ./readpng cookiebear.png, it should look like:

File info:
Dimensions: 512 x 443
Bit depth: 8
Color type: RGB + Alpha
Interlaced: no


For graybear.png:

File info:
Dimensions: 512 x 443
Bit depth: 8
Color type: Grayscale + Alpha
Interlaced: no


And for has_text.png:

File info:
Dimensions: 32 x 32
Bit depth: 4
Color type: Grayscale
Interlaced: no


## 3. Showing all the chunks (dump_chunks)

Now that you have a function to read a chunk header, this one should be straightforward.

dump_chunks should work like this:

1. open and check the file, like before.
2. in a loop:
   1. read a chunk header.
   2. print the chunk’s type and length. (remember, %.4s)
   3. if the chunk is an IEND chunk, exit the loop.
   4. otherwise, skip the chunk (read below).

Look at the diagram showing how chunks are laid out. You already read the length and identifier; now you need to skip the data and CRC. This can be done in one line of code.
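A sketch of that skip, assuming the chunk header has just been read and its length already byte-swapped. The function name is made up; the one line that matters is the fseek.

```c
#include <stdio.h>

/* Hypothetical helper: moves the file position past a chunk's data and
   its 4-byte CRC. Returns 0 on success, like fseek does. */
int skip_chunk_body_sketch(FILE *f, unsigned int length) {
    return fseek(f, (long)length + 4, SEEK_CUR);   /* relative to current position */
}
```

SEEK_CUR makes the offset relative to where the file position is now, so after this call the next fread lands on the following chunk's length field.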

Don’t forget: Ctrl+C stops a runaway program!

Done correctly, the outputs on the three test files should be:

• ./readpng cookiebear.png dump:
'IHDR' (length = 13)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 2622)
'IEND' (length = 0)

• ./readpng graybear.png dump:
'IHDR' (length = 13)
'pHYs' (length = 9)
'tEXt' (length = 27)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 8192)
'IDAT' (length = 392)
'IEND' (length = 0)

• ./readpng has_text.png dump:
'IHDR' (length = 13)
'gAMA' (length = 4)
'tEXt' (length = 14)
'tEXt' (length = 49)
'tEXt' (length = 56)
'tEXt' (length = 251)
'tEXt' (length = 57)
'tEXt' (length = 20)
'IDAT' (length = 200)
'IEND' (length = 0)


This is the beauty of chunked file formats: you don’t even have to know what most of these chunks are! But you can easily see the structure and find the things you do care about.

## 4. Extracting textual data (show_text)

You can see above that has_text.png has some chunks with the tEXt identifier. These are used to embed human-readable information in the file. Think things like keywords, descriptions, copyright info, and so on.

This mode will be pretty simple:

1. open and check the file, like before.
2. in a loop:
   1. read a chunk header.
   2. if it’s an IEND chunk, stop.
   3. if it’s a tEXt chunk, read it and display the name and value.
   4. skip any other chunks.

Each tEXt chunk is a sort of “key-value pair”: a name, which says what the text is, and a value, which is the actual text. These chunks’ data looks like this:

The chunk’s length includes the length of the name, the zero terminator in the middle, and the length of the value.

Since these chunks can be any length, you will have to dynamically allocate space to hold the text data for printing. If you use malloc, don’t forget what you have to do when you’re done with that space!

Notes:

• I gave you the MAX_REASONABLE_TEXT_CHUNK_SIZE constant so you can check for that and give an error if the text chunk is too big. (Maybe it’s a corrupt file?)
• The value in the file has no zero terminator, but you need one to print it out. So how big should you make your space?
• And then what should you do after reading in the chunk’s data?
• Think carefully about what parameters you will pass to fread.
• If you use malloc, it gives you a pointer already. You don’t have to use & on that.
• sizeof is a compile-time operator. It will never give the length of an array that a pointer points to.
• Once you’ve read the chunk’s data, you can easily print the name. But how do you print the value?
• You can find out how long the name is…
• And use some pointer arithmetic…
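The notes above hint at one approach, which might be sketched like this. Both function names are hypothetical. The check against MAX_REASONABLE_TEXT_CHUNK_SIZE is left to the caller here, because that constant lives in the starting code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads `length` bytes of tEXt data into a new buffer and adds a zero
   terminator at the end. Returns NULL on failure. The caller must free(). */
char *read_text_data_sketch(FILE *f, unsigned int length) {
    char *buf = malloc((size_t)length + 1);       /* +1 for our terminator */
    if (buf == NULL) return NULL;
    if (fread(buf, 1, length, f) != length) {     /* buf is already a pointer: no & */
        free(buf);
        return NULL;
    }
    buf[length] = '\0';                           /* terminates the value */
    return buf;
}

/* The name ends at the zero byte in the middle, so the value starts one
   byte past it. Returns NULL if the data has no separator at all. */
const char *text_value_sketch(const char *buf, unsigned int length) {
    size_t name_len = strlen(buf);
    if (name_len >= length) return NULL;          /* malformed chunk */
    return buf + name_len + 1;
}
```

After these two calls, printing buf with %s shows the name (it stops at the zero in the middle), and printing the returned pointer with %s shows the value.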

Done correctly, using ./readpng has_text.png text should show something like:

Title:
PngSuite

Author:
Willem A.J. van Schaik
(willem@schaik.com)

Copyright Willem van Schaik, Singapore 1995-96

Description:
A compilation of a set of images created to test the
various color-types of the PNG format. Included are
black&white, color, paletted, with alpha channel, with
transparency formats. All bit-depths allowed according
to the spec are present.

Software:
Created on a NeXTstation color using "pnmtopng".

Disclaimer:
Freeware.


Neither of the other two images has any text.

## Extra credit (up to +10 points)

If you do the extra credit, please put a comment at the top of your source code telling the grader that you implemented it.

Since this is a chunked file format, it’s pretty straightforward to add additional chunks into the file. The PNG format is also forgiving about the location of tEXt chunks.

For extra credit, implement a new mode that works like this:

./readpng somefile.png add Name "This is a value for that text"


and this would modify somefile.png by adding a new tEXt chunk where the name is “Name” and the value is “This is a value for that text”.

When you use “quotes” on the command line, the stuff in quotes will be a single item in argv, so don’t worry about having to handle that.

Notes:

• You’ll have to modify the Mode enum, add a case to parse_arguments for when there are 5 arguments, and modify main to call a new function.
• Also, the name must be 1 to 79 characters long, according to the PNG spec.
• You’ll need to open the file for reading and writing.
• The easiest place to put the new chunk would be at the end…
• but remember, you need an IEND chunk after it.
• For the CRC fields of your new tEXt chunk and IEND chunk, just write 0s.
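As a sketch of what writing the new chunk could look like, the code below emits one tEXt chunk at the current file position. The function names are invented for illustration. It writes the length one byte at a time, most significant first, so no byte-swap is needed on the way out.

```c
#include <stdio.h>
#include <string.h>

/* Writes a 32-bit value in big-endian order, one byte at a time. */
static void write_be32_sketch(FILE *f, unsigned long v) {
    unsigned char b[4];
    b[0] = (unsigned char)((v >> 24) & 0xFF);
    b[1] = (unsigned char)((v >> 16) & 0xFF);
    b[2] = (unsigned char)((v >>  8) & 0xFF);
    b[3] = (unsigned char)( v        & 0xFF);
    fwrite(b, 1, 4, f);
}

/* Hypothetical helper: writes length, "tEXt", name, zero separator,
   value, and a zeroed CRC. Returns 1 on success, or 0 if the name is
   not 1 to 79 characters long. */
int write_text_chunk_sketch(FILE *f, const char *name, const char *value) {
    size_t name_len  = strlen(name);
    size_t value_len = strlen(value);
    if (name_len < 1 || name_len > 79) return 0;

    write_be32_sketch(f, (unsigned long)(name_len + 1 + value_len));
    fwrite("tEXt", 1, 4, f);
    fwrite(name, 1, name_len + 1, f);   /* +1 also writes the zero separator */
    fwrite(value, 1, value_len, f);     /* no terminator after the value     */
    write_be32_sketch(f, 0);            /* CRC: just zeros, per the notes    */
    return 1;
}
```

One possible way to use it: an IEND chunk is 12 bytes, so seeking to 12 bytes before the end of the file puts the position at the old IEND; the new tEXt chunk can be written there, followed by a fresh IEND chunk.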

## Grading

We will be compiling your programs with the following options, so be sure to compile with them while you develop as well:

$ gcc -Wall -Werror --std=c99 -g -o readpng readpng.c

• [5] Submitted properly
• You’ll lose all 5 points if not submitted correctly.
• [5] Compiles and runs

Never turn in a program that doesn’t compile.

• [10] Code style
• make good functions. never copy and paste something that could be a function.
• use structs where they make sense.
• [10] Opens the file and checks the signature
• [4] Checks that the file was successfully opened
• [6] Correctly checks the file signature
• [30] ./readpng file Reads and displays the basic file information
• [10] Uses an appropriate structure to represent that info
• [10] Reads that structure in at once and byteswaps the right fields
• [10] Displays the info in a nice, human-readable way
• [20] ./readpng file dump Lists all chunks in the file
• [10] Correctly skips over chunks and stops at IEND
• [10] Displays the chunks properly
• [20] ./readpng file text Displays any textual data in the file
• [5] Correctly reads and zero-terminates the data
• [5] Correctly allocates (and possibly deallocates) space
• [10] Displays both the name and the value

## Submission

Name your file with proj1, like abc123_proj1.tar.gz. proj1. Not project1. Not readpng. Not proj01. proj1. proj1. proj1. proj1. proj1. proj1. proj1. proj1. proj1. proj1. proj1. proj1.

Submit ONLY YOUR readpng.c FILE INSIDE A TAR FILE NAMED abc123_proj1.tar.gz.

Please don’t include the PNG files in your submission. They’ll waste a bunch of AFS space.

You can make a new directory and copy your readpng.c file into there, and then tar that directory.