Why Do Files Have Extensions?

So I was talking to Susan earlier. She was telling me about this file she saw. She says the file had extensions. Extensions!

I knew it was fake. Nothing real looks that good – except mine, of course.

Ok, ok, let’s cut it with the hair jokes… you get it? Cut? Ha ha ha.

This article is about hair file extensions. You know, like world_domination_plan.txt, or perhaps mahcatcat.jpeg.

File Extensions

File extensions exist so we can tell what type a file is. For example, a .doc file is a document, a .jpeg is an image, and so on. The extension tells us what to expect inside.

Furthermore, the extension lets the computer know what kind of file it is. Scenario time. You are browsing through pictures of your cat, and double-click one of them to open it. Only certain programs can show pictures, though – iTunes isn’t going to help you and your cat bond here.

The file browser needs to know what program to open. File extensions to the rescue! The file browser sees the file has a .jpeg extension. That means it is an image file. So the file browser uses an image viewer to open the file. Similarly, if it encounters a .txt file, it will use Notepad to view it.

These are called file associations. Each file extension is associated with a specific program that handles it. Multiple file extensions can be handled by the same program, like Microsoft Word handling both .doc and .docx.
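If it helps to see it, a file association is basically a lookup table. Here is a minimal sketch in Python. The table and the program_for function are made up for illustration; real operating systems keep this information in their own settings.

```python
from pathlib import PurePath

# Hypothetical association table: extension -> program name.
# The entries are illustrative, not what any real system ships with.
ASSOCIATIONS = {
    ".jpeg": "image viewer",
    ".txt": "Notepad",
    ".doc": "Microsoft Word",
    ".docx": "Microsoft Word",  # several extensions can share one program
}

def program_for(filename):
    """Return the program associated with the file's extension, or None."""
    extension = PurePath(filename).suffix.lower()
    return ASSOCIATIONS.get(extension)

print(program_for("mahcatcat.jpeg"))
print(program_for("world_domination_plan.txt"))
print(program_for("README"))  # no extension, so no association
```

A file with no extension falls through to None, which is the sketch's version of the file browser shrugging at you.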

Fun fact: Files can exist without any extension at all. The file browser doesn’t know how to open them, though. They can also have multiple extensions, sort of.

The extension of a file is only the part after the last period. For example, if I have a file named “file.txt.doc.jpeg.png.txt.exe”, it is a .exe file. All the other parts belong to the file name itself. Stacked extensions are sometimes used on purpose, when a file has been processed in several steps. For example, “file.tar.gz” was first bundled into an archive with tar and then compressed with gzip. To get the original files back, we decompress it with gzip, yielding “file.tar”. Then we extract that archive with tar, yielding the original files.
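Python's standard pathlib module follows the same "last period wins" rule, so it makes a handy way to check the examples above. This is just a small sketch; the variable names are mine.

```python
from pathlib import PurePath

silly = PurePath("file.txt.doc.jpeg.png.txt.exe")
print(silly.suffix)  # .exe
print(silly.stem)    # file.txt.doc.jpeg.png.txt

archive = PurePath("file.tar.gz")
print(archive.suffix)    # .gz
print(archive.suffixes)  # ['.tar', '.gz']
```

The suffix attribute is the real extension. The suffixes list shows every dotted part, which is useful when you want to notice stacked names like “file.tar.gz”.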

Extensions And Security

As I said above, only the last extension is the actual extension. This has some security implications. Windows and OS X hide extensions of known (common) file types by default. For example, “cat.jpeg” shows as simply “cat.” It still opens fine, the file browser just doesn’t show the extension after it.

What happens if I name a file “cat.jpeg.exe”? It shows as cat.jpeg. But it isn’t a .jpeg file, it is a .exe file. If you try to open it, it will not show you a picture of your cat – it will run a program.

But it looks like an image file, so people will open it. If the program is malicious, bad things happen. This is a classic way to disguise a virus.
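Here is a rough sketch of the trick, and of a crude way to spot it. The sets of extensions are assumptions I picked for the example, and displayed_name only imitates what a file browser with hidden extensions might show. It is not how Windows or OS X actually decide.

```python
from pathlib import PurePath

# Assumed sets, for illustration only.
HIDDEN_WHEN_KNOWN = {".jpeg", ".txt", ".exe"}
HARMLESS_LOOKING = {".jpeg", ".txt"}
EXECUTABLE = {".exe"}

def displayed_name(filename):
    """Imitate a file browser that hides known extensions."""
    path = PurePath(filename)
    if path.suffix.lower() in HIDDEN_WHEN_KNOWN:
        return path.stem
    return filename

def looks_disguised(filename):
    """Flag names like cat.jpeg.exe: harmless-looking, but really executable."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE
        and suffixes[-2] in HARMLESS_LOOKING
    )

print(displayed_name("cat.jpeg"))       # cat
print(displayed_name("cat.jpeg.exe"))   # cat.jpeg
print(looks_disguised("cat.jpeg.exe"))  # True
print(looks_disguised("cat.jpeg"))      # False
```

A real scanner would need much more than this, but the sketch shows why the displayed name alone cannot be trusted.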

8.3 Filenames

You have likely noticed that most extensions only have three letters. Why is that? Length limits. A long, long time ago – in this galaxy – on systems like MS-DOS, the file name could be at most eight characters and the extension at most three. That is why so many common extensions are three letters.
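The length rule is easy to sketch. The fits_8_3 function below is a name I made up, and it only checks the lengths described above. The real rules also restricted which characters were allowed, and this sketch ignores that.

```python
def fits_8_3(filename):
    """Rough check: 1 to 8 characters of name, at most 3 of extension."""
    name, dot, extension = filename.rpartition(".")
    if not dot:
        # No period at all: the whole thing is the name.
        name, extension = filename, ""
    # A second period in the name is not allowed in this simplified model.
    return 1 <= len(name) <= 8 and len(extension) <= 3 and "." not in name

print(fits_8_3("CAT.JPG"))         # True
print(fits_8_3("cat.jpeg"))        # False: extension is four characters
print(fits_8_3("mahcatcat.jpeg"))  # False: name is nine characters
```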

The restriction isn’t around anymore, but three letters is still a pretty common convention. It is also why some formats have two extensions, like .htm and .html, which both designate the same format. Newer file formats may make use of a longer extension, like .resources. This helps prevent collisions where two different file formats use the same extension.

File Contents

I will let you in on a little secret. Come closer, so I can tell you. Closer still… whoa, there, way too close. Ever heard of personal space?

Ok, here it is. The file extension doesn’t affect the file contents at all. It is just part of the name, and nothing forces the contents to match it.

But wait, you say. When I try to change the extension of a file, Windows says I might break stuff. Renaming doesn’t change the contents of the file at all, though, just the extension. As a test, you can change the extension of a file, change it back, then see if the file still works. It will.

Why does Windows say things will break? Things do break, sort of, but not the file itself. The problem is twofold. First, when you change the extension, the program that used to open the file will not be used anymore. Instead, the program associated with the new extension will try to open it. Second, that program will fail, because while the extension is one it can open, the file contents aren’t in the format that extension promises. This is what Windows means by things breaking. Again, the file is unchanged – you just cannot open it normally until you change the extension back.

The file extension denotes the type of the file. The type represents a certain way to store information. For example, an image consists of pixels. Each pixel consists of a red, green, and blue value, each a number from 0 to 255. There are multiple formats you can store an image in – .bmp, .png, .jpeg, and so on.

The program opening the file must be able to “understand” the format of the file. If the data isn’t formatted correctly, it will not be able to read the data. The extension indicates what format the data is supposed to be stored in. If the extension is .jpeg, the program will assume the data is in JPEG format. If it isn’t, the program will typically fail to open the file.
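One way a program can notice the mismatch is by looking at the first few bytes of the file. PNG and JPEG files each start with a well-known signature. The sketch below checks only those two formats. The function names and the extension table are my own, and real image programs do far more validation than this.

```python
from pathlib import PurePath

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
JPEG_SIGNATURE = b"\xff\xd8\xff"      # first 3 bytes of a JPEG file

# Illustrative table: which format each extension promises.
EXTENSION_TO_FORMAT = {".png": "png", ".jpeg": "jpeg", ".jpg": "jpeg"}

def sniff_image_format(data):
    """Guess the format from the leading bytes, or return None."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None

def extension_matches_contents(filename, data):
    """True if the extension's promised format matches the sniffed one."""
    expected = EXTENSION_TO_FORMAT.get(PurePath(filename).suffix.lower())
    return expected is not None and expected == sniff_image_format(data)

fake_png = PNG_SIGNATURE + b"...pixel data would go here..."
print(extension_matches_contents("cat.png", fake_png))   # True
print(extension_matches_contents("cat.jpeg", fake_png))  # False
```

The bytes are the same in both calls. Only the name changed, and that is enough to make the promise and the contents disagree.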

When you change the extension of a file, you don’t change the formatting of its contents, which is why it is then possible to change the extension back and have an intact file.
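You don’t have to take my word for it. The sketch below creates a throwaway file, renames it to .jpeg and back, and compares a SHA-256 hash of the contents at each step. The file name and its contents are invented for the test.

```python
import hashlib
import tempfile
from pathlib import Path

def digest(path):
    """Hash the file contents; the file name plays no part in this."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def rename_round_trip():
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / "notes.txt"
        original.write_bytes(b"feed the cat\n")
        before = digest(original)

        # Change only the extension; the bytes on disk stay the same.
        renamed = original.with_suffix(".jpeg")
        original.rename(renamed)
        during = digest(renamed)

        # Change it back.
        renamed.rename(original)
        after = digest(original)
        return before, during, after

before, during, after = rename_round_trip()
print(before == during == after)  # True
```

All three hashes match, because renaming touches the name and nothing else.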

It is important to note how the extension is changed. So far I have gone over changing the file extension itself by renaming the file in a file browser. This does not change the contents. However, saving the file under a different extension from within a program does.

When a program has a “save as” option, it usually lets you pick from multiple formats for the saved file. For example, an image editor can save a file as either a .png or a .jpeg. You typically choose the format by choosing the extension of the file name when saving. Here both the extension and the format change, because the program formats the data according to the extension you gave it. In this case, the file contents are changed.
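To make the difference concrete, here is a small sketch of a program that picks its output format from the extension. Image encoders are too long for a blog post, so I am swapping in two simple text formats, .json and .csv, as stand-ins. The encode_for function and the sample data are hypothetical.

```python
import csv
import io
import json
from pathlib import PurePath

def encode_for(filename, rows):
    """Return file contents for rows, formatted according to the extension."""
    extension = PurePath(filename).suffix.lower()
    if extension == ".json":
        return json.dumps(rows)
    if extension == ".csv":
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"no encoder for {extension!r}")

rows = [["name", "lives"], ["cat", 9]]
print(encode_for("pets.json", rows))
print(encode_for("pets.csv", rows))
```

Same data, different bytes, depending on the extension you asked for. That is the opposite of renaming, where the bytes stay put and only the name changes.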

 

Jacob Clarity

 
