Sound is a form of energy that travels through a medium by causing its particles to vibrate; it cannot travel through a vacuum, which contains no particles. It can be converted into an electrical signal that represents it. This conversion is done by a transducer, a device that converts one type of energy into another, such as a microphone. As energy the sound is analogue, but it must be converted to digital data before it can be stored in a computer's memory. Analogue data varies continuously and can take any of an infinite number of values, whilst digital data is discrete and can take only a fixed set of values.
To be converted to digital data, the analogue sound is sampled at regular intervals (twice every second, for example) using a process called Pulse Amplitude Modulation. Some quality is lost, however, because not all of the data from the analogue sound is retained, and because the value (e.g. loudness) of each sample is rounded to the nearest available level, a process called quantisation, before being stored in binary. Encoding the samples as numbers in this way is called Pulse Code Modulation, and the values are stored in sequence in a binary file. Two factors are considered when sampling: the sampling rate, which is the number of samples taken per second, and the sampling resolution, which is the number of binary digits used to store each sample and therefore determines how many discrete values are available.
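The sampling and quantisation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name sample_and_quantise, the 1 Hz test tone, and the choice of 8 samples per second at 3 bits are all invented for the example, and the "analogue" signal is simply a function of time with values between -1.0 and 1.0.

```python
import math

def sample_and_quantise(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t), assumed to lie in -1.0..1.0, and round each sample to a level."""
    levels = 2 ** bits                       # number of discrete values available
    count = int(duration_s * sample_rate_hz)
    codes = []
    for n in range(count):
        t = n / sample_rate_hz               # time of the n-th sample, in seconds
        amplitude = signal(t)                # the "analogue" value at that instant
        # Map -1.0..1.0 onto 0..levels-1 and round to the nearest level (quantisation).
        codes.append(round((amplitude + 1.0) / 2.0 * (levels - 1)))
    return codes

def tone(t):
    """Hypothetical test signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * 1.0 * t)

codes = sample_and_quantise(tone, duration_s=1.0, sample_rate_hz=8, bits=3)
binary = [format(c, "03b") for c in codes]   # each sample stored as 3 binary digits
print(codes)
print(binary)
```

Raising sample_rate_hz captures more points along the wave, and raising bits makes the rounding finer; both increase the amount of data stored.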
Frequency is the number of complete waves per second, and is measured in hertz (Hz), kilohertz (kHz, thousands) and megahertz (MHz, millions). According to Harry Nyquist, a famous American electrical engineer, a sound must be sampled at a rate of at least twice the highest frequency it contains in order to retain its quality. To calculate how large the file will be after quantisation when the minimum number of samples is taken, multiply the highest frequency by 2 to get the sampling rate, then by the resolution, and then by the length of the recording in seconds. This gives the size in bits, which can be converted to bytes, kilobytes and megabytes.
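As a worked example of this calculation, the sketch below assumes a recording whose highest frequency is 20 kHz, a resolution of 16 bits, and a length of 60 seconds. These figures, the function name pcm_size_bits, and the use of 1000 bytes per kilobyte are assumptions made for illustration. The rate times the resolution gives bits per second, so the result is also multiplied by the duration to get a total.

```python
def pcm_size_bits(max_frequency_hz, resolution_bits, duration_s):
    """Size in bits of a mono recording sampled at the Nyquist minimum rate."""
    sample_rate_hz = 2 * max_frequency_hz      # at least twice the highest frequency
    return sample_rate_hz * resolution_bits * duration_s

size_bits = pcm_size_bits(max_frequency_hz=20_000, resolution_bits=16, duration_s=60)
size_bytes = size_bits / 8                     # 8 bits per byte
size_kilobytes = size_bytes / 1000             # assuming 1 kB = 1000 bytes
size_megabytes = size_kilobytes / 1000         # assuming 1 MB = 1000 kB

print(size_bits, size_bytes, size_kilobytes, size_megabytes)
```

With these figures the recording needs 38,400,000 bits, which is 4.8 MB for one minute of single-channel audio.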
To play the stored music through speakers, the digital data must be converted back to analogue. Because only the samples are available, the computer is required to guess the missing values by looking at the samples on either side of them. Graphically, this looks like a line of best fit drawn through the discrete steps. As a result, the audio often sounds different from the original.
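One simple way to picture this guessing is linear interpolation, where each missing value is taken from a straight line joining the two neighbouring samples. The sketch below is a simplification for illustration, not a description of how real playback hardware works; the function name interpolate and the sample values are made up, and t is assumed to be zero or greater.

```python
def interpolate(samples, sample_rate_hz, t):
    """Estimate the value at time t (seconds) from the two nearest samples."""
    position = t * sample_rate_hz        # fractional index into the sample list
    left = int(position)                 # index of the sample just before t
    if left >= len(samples) - 1:
        return float(samples[-1])        # at or past the last sample: hold its value
    fraction = position - left           # how far t lies between the two samples
    return samples[left] + (samples[left + 1] - samples[left]) * fraction

samples = [0, 4, 2]                      # hypothetical samples taken twice per second
halfway_up = interpolate(samples, 2, 0.25)     # between samples 0 and 4
halfway_down = interpolate(samples, 2, 0.75)   # between samples 4 and 2
print(halfway_up, halfway_down)
```

The estimate is only as good as the samples it is drawn from: any peak or dip that fell between two samples is lost, which is one reason playback can differ from the original sound.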
On computers, sound may be stored in a variety of file formats, which vary in quality and size. A common uncompressed format is WAV; its size depends on the sampling rate and resolution, from roughly 2.5 MB per minute at 22.05 kHz, 16-bit mono to about 10 MB per minute at CD quality (44.1 kHz, 16-bit stereo). MPEG formats such as MP3, on the other hand, discard data for sounds that human hearing cannot perceive. As such, they can be as little as 10% of the size of the original recording. When sound is stored digitally it is also much easier to edit and alter recordings, for example by adding effects. This makes changing music much easier than it was before these techniques existed, when each audio track was recorded on a separate tape and a mistake meant the recording had to be made again.
Sound may also be synthesised using MIDI (Musical Instrument Digital Interface), which gives the computer instructions about exactly what sound to make, including factors such as its pitch and loudness. This is like vector graphics, in that the file holds the instructions needed to recreate the content rather than the content itself. As such, this technology cannot be used to create a copy of an existing audio recording. The file size is much smaller, however, because it is not the audio itself that is stored but the instructions used to create it.
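To give a sense of how compact these instructions are, the sketch below builds the two basic MIDI channel messages for playing one note: a "note on" (status byte 0x90 plus the channel number, followed by the note number and the velocity, which controls loudness) and a "note off" (status byte 0x80 plus the channel number). The helper names note_on and note_off are invented for this example, and the comparison figure assumes one second of 16-bit mono audio sampled at 44.1 kHz.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI note-on message (channel 0-15, note and velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of range for a MIDI message")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Build a 3-byte MIDI note-off message with a release velocity of 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("value out of range for a MIDI message")
    return bytes([0x80 | channel, note, 0])

# Start and stop middle C (note number 60) on the first channel at moderate loudness.
message = note_on(0, 60, 100) + note_off(0, 60)

# For comparison: one second of sampled audio (assumed 44.1 kHz, 16-bit, mono).
sampled_bytes = 44_100 * 16 // 8

print(len(message), sampled_bytes)
```

The note takes 6 bytes as instructions however long it is held, whereas the sampled version needs 88,200 bytes for every second of sound.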
One last way that audio can be used on computers is streaming. This involves sending the sound in packets over a network, usually the internet, and buffering it on arrival. The recording is sent a small number of bits at a time, and each part is discarded after being played. Because the music is not stored on a hard disk, it is more difficult to copy, which helps to protect copyright; however, it cannot be listened to when the computer cannot contact the server. Streaming is also affected by bandwidth: if only a small number of bits can be sent per second, the audio will often need to stop, wait until more data has been received, and then resume.
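The effect of bandwidth on playback can be illustrated with a very rough simulation that advances one second at a time. It is a sketch under simplifying assumptions rather than a model of any real streaming protocol: the function name count_stalls and the bit rates are invented, each second's download is assumed to arrive before that second is played, and the wait before playback first starts is not counted as a stall.

```python
def count_stalls(track_seconds, bandwidth_bps, bitrate_bps):
    """Count the seconds in which playback has to pause to wait for more data."""
    if bandwidth_bps <= 0 or bitrate_bps <= 0:
        raise ValueError("bandwidth and bitrate must be positive")
    total_bits = track_seconds * bitrate_bps
    received = 0          # bits downloaded so far
    played = 0            # bits already played (and then discarded)
    stalls = 0
    started = False
    while played < total_bits:
        received = min(total_bits, received + bandwidth_bps)  # one second of download
        needed = min(bitrate_bps, total_bits - played)        # next second of audio
        if received - played >= needed:
            played += needed      # enough is buffered: play one second
            started = True
        elif started:
            stalls += 1           # buffer ran dry partway through the track
    return stalls

fast = count_stalls(track_seconds=4, bandwidth_bps=256_000, bitrate_bps=128_000)
slow = count_stalls(track_seconds=4, bandwidth_bps=64_000, bitrate_bps=128_000)
print(fast, slow)
```

When the connection delivers data at least as fast as the audio consumes it, the track plays straight through; at half that speed, playback has to pause after nearly every second played.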