- **Lossless compression**: reducing the amount of computer storage space needed to store data without needing to remove or irreversibly alter any of this data in doing so. In other words, decompressing losslessly compressed information returns exactly the original data.
- **Lossy compression**: like lossless compression, but instead removing, irreversibly altering, or only approximating information for the purpose of further reducing the amount of computer storage space needed. In other words, decompressing lossy compressed information returns an approximation of the original data.
- **Block**: A (short) section of linear pulse-code modulated audio with one or more channels.
- **Subblock**: All samples within a corresponding block for one channel. One or more subblocks form a block, and all subblocks in a certain block contain the same number of samples.
- **Frame**: A frame header, one or more subframes, and a frame footer. It encodes the contents of a corresponding block.
- **Subframe**: An encoded subblock. All subframes within a frame code for the same number of samples. When interchannel decorrelation is used, a subframe can correspond to either the (per-sample) average of two subblocks or the (per-sample) difference between two subblocks, instead of to a subblock directly.
- **Interchannel samples**: A sample count that applies to all channels. For example, one second of 44.1 kHz audio has 44100 interchannel samples, meaning each channel has that number of samples.
- **Block size**: The number of interchannel samples contained in a block or coded in a frame.
- **Bit depth** or **bits per sample**: the number of bits used to contain each sample. This MUST be the same for all subblocks in a block but MAY be different for different subframes in a frame because of interchannel decorrelation.
- **Predictor**: a model used to predict samples in an audio signal based on past samples. FLAC uses such predictors to remove redundancy in a signal in order to be able to compress it.
- **Linear predictor**: a predictor using linear prediction. This is also called **linear predictive coding (LPC)**. With a linear predictor, each prediction is a linear combination of past samples, hence the name. A linear predictor has a causal discrete-time finite impulse response.
- **Muxing**: short for multiplexing, combining several streams or files into a single stream or file. In the context of this document, muxing more specifically refers to embedding a FLAC stream in a container.
- **Fixed predictor**: a linear predictor in which the model parameters are the same across all FLAC files, and thus do not need to be stored.
- **Predictor order**: the number of past samples that a predictor uses. For example, a 4th order predictor uses the 4 samples directly preceding a certain sample to predict it. In FLAC, samples used in a predictor are always consecutive, and are always the samples directly before the sample that is being predicted.
- **Residual**: The audio signal that remains after a predictor has been subtracted from a subblock. If the predictor has been able to remove redundancy from the signal, the samples of the remaining signal (the **residual samples**) will have, on average, a smaller numerical value than the original signal.
- **Rice code**: A variable-length code that compresses data by making use of the observation that, after using an effective predictor, most residual samples are closer to zero than the original samples, while still allowing for a small part of the samples to be much larger.

- **Blocking**. The input is split up into many contiguous blocks.
- **Interchannel Decorrelation**. In the case of stereo streams, the FLAC format allows for transforming the left-right signal into a mid-side signal, a left-side signal or a side-right signal to remove redundancy between channels. Choosing between any of these transformations is done independently for each block.
- **Prediction**. To remove redundancy in a signal, a predictor is stored for each subblock or its transformation as formed in the previous step. A predictor consists of a simple mathematical description that can be used, as the name implies, to predict a certain sample from the samples that preceded it. As this prediction is rarely exact, the error of this prediction is passed on to the next stage. The predictor of each subblock is completely independent from other subblocks. Since the methods of prediction are known to both the encoder and decoder, only the parameters of the predictor need to be included in the compressed stream. If no usable predictor can be found for a certain subblock, the signal is stored uncompressed and the next stage is skipped.
- **Residual Coding**. As the predictor does not describe the signal exactly, the difference between the original signal and the predicted signal (called the error or residual signal) is coded losslessly. If the predictor is effective, the residual signal will require fewer bits per sample than the original signal. FLAC uses Rice coding, a subset of Golomb coding, with either 4-bit or 5-bit parameters to code the residual signal.

- **Independent**. All channels are coded independently. All non-stereo files MUST be encoded this way.
- **Mid-side**. A left and right subblock are converted to mid and side subframes. To calculate a sample for a mid subframe, the corresponding left and right samples are summed and the result is shifted right by 1 bit. To calculate a sample for a side subframe, the corresponding right sample is subtracted from the corresponding left sample. On decoding, all mid channel samples have to be shifted left by 1 bit. Also, if a side channel sample is odd, 1 has to be added to the corresponding mid channel sample after it has been shifted left by one bit. To reconstruct the left channel, the corresponding samples in the mid and side subframes are added and the result shifted right by 1 bit, while for the right channel the side channel has to be subtracted from the mid channel and the result shifted right by 1 bit.
- **Left-side**. The left subblock is coded and the left and right subblocks are used to code a side subframe. The side subframe is constructed in the same way as for mid-side. To decode, the right subblock is restored by subtracting the samples in the side subframe from the corresponding samples in the left subframe.
- **Side-right**. The left and right subblocks are used to code a side subframe and the right subblock is coded. The side subframe is constructed in the same way as for mid-side. To decode, the left subblock is restored by adding the samples in the side subframe to the corresponding samples in the right subframe.
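
The mid-side round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names are hypothetical, samples are plain Python integers, and `>>` is relied on as an arithmetic (sign-preserving) right shift.

```python
def encode_mid_side(left, right):
    """Turn left/right subblocks into mid and side sample lists."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]  # sum, shifted right by 1 bit
    side = [l - r for l, r in zip(left, right)]        # left minus right
    return mid, side


def decode_mid_side(mid, side):
    """Restore left/right subblocks from mid and side sample lists."""
    left, right = [], []
    for m, s in zip(mid, side):
        # Shift left by 1 bit and add 1 when the side sample is odd.
        m = (m << 1) + (s & 1)
        left.append((m + s) >> 1)
        right.append((m - s) >> 1)
    return left, right


left, right = [5, -3, 0, 32767], [2, 2, -1, -32768]
mid, side = encode_mid_side(left, right)
assert decode_mid_side(mid, side) == (left, right)
```

The transformation is lossless because the sum and the difference of two integers always have the same parity, so the bit dropped when forming the mid sample can be recovered from the lowest bit of the side sample.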

- **Verbatim**. Samples are stored directly, without any modeling. This method is used for inputs with little correlation, like white noise. Since the raw signal is not actually passed through the residual coding stage (it is added to the stream 'verbatim'), this method is different from using a zero-order fixed predictor.
- **Constant**. A single sample value is stored. This method is used whenever a signal is pure DC ("digital silence"), i.e., a constant value throughout.
- **Fixed predictor**. Samples are predicted with one of five fixed (i.e., predefined) predictors, and the error of this prediction is processed by the residual coder. These fixed predictors are well suited for predicting simple waveforms. Since the predictors are fixed, no predictor coefficients are stored. From a mathematical point of view, the predictors work by extrapolating the signal from the previous samples. The number of previous samples used is equal to the predictor order.
- **Linear predictor**. Samples are predicted using past samples and a set of predictor coefficients, and the error of this prediction is processed by the residual coder. Compared to a fixed predictor, using a generic linear predictor adds overhead as predictor coefficients need to be stored. Therefore, this method of prediction is best suited for predicting more complex waveforms, where the added overhead is offset by space savings in the residual coding stage resulting from more accurate prediction. A linear predictor in FLAC has two parameters besides the predictor coefficients and the predictor order: the number of bits with which each coefficient is stored (the coefficient precision) and a prediction right shift. A prediction is formed by taking the sum of multiplying each predictor coefficient with the corresponding past sample, and dividing that sum by applying the specified right shift.
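
As an illustration of how a linear prediction is formed, the following Python sketch multiplies each coefficient with a past sample, sums the products, and applies the right shift. The function names are hypothetical, and the sketch assumes that the first coefficient applies to the most recent sample; that ordering is an assumption of this example rather than something stated in the text above.

```python
def lpc_predict(past, coefficients, shift):
    """Predict the next sample from past samples (past[-1] is the most recent)."""
    # Assumption: coefficients[0] pairs with the sample directly before the
    # one being predicted, coefficients[1] with the one before that, etc.
    total = sum(c * past[-1 - i] for i, c in enumerate(coefficients))
    return total >> shift  # arithmetic right shift performs the division


def lpc_restore(warm_up, coefficients, shift, residuals):
    """Rebuild a subblock from warm-up samples and residual samples."""
    samples = list(warm_up)
    for residual in residuals:
        samples.append(lpc_predict(samples, coefficients, shift) + residual)
    return samples


# Coefficients [2, -1] with shift 0 behave like the second-order fixed predictor.
assert lpc_predict([1, 3], [2, -1], 0) == 5
```

An encoder would compute each residual sample as the original sample minus the same prediction, which is why adding the residual back restores the signal exactly.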

- When the samples that need to be stored do not all have the same value (i.e., the signal is not constant), a constant subframe cannot be used.
- When an encoder is unable to find a fixed or linear predictor for which all residual samples are representable in 32-bit signed integers, a verbatim subframe is used.

Description | Reference |
---|---|
Metadata block type 127 | |
Minimum and maximum block sizes smaller than 16 in streaminfo metadata block | |
Sample rate bits 0b1111 | |
Uncommon blocksize 65536 | |
Predictor coefficient precision bits 0b1111 | |
Negative predictor right shift | |

- One or more decoded sample values exceed the range offered by the bit depth as coded for that frame. E.g., in a frame with a bit depth of 8 bits, any samples not in the inclusive range from -128 to 127 are not valid.
- The number of wasted bits used by a subframe is such that the bit depth of that subframe equals zero or is negative.
- A frame header CRC or frame footer CRC does not validate.
- One of the forbidden bit patterns described above is used.

- The sample rate bits in the frame header MUST be 0b0001-0b1110, i.e., the frame header MUST NOT refer to the streaminfo metadata block to describe the sample rate.
- The bit depth bits in the frame header MUST be 0b001-0b111, i.e., the frame header MUST NOT refer to the streaminfo metadata block to describe the bit depth.
- The stream MUST NOT contain blocks with more than 16384 interchannel samples, i.e., the maximum block size must not be larger than 16384.
- Audio with a sample rate less than or equal to 48000 Hz MUST NOT be contained in blocks with more than 4608 interchannel samples, i.e., the maximum block size used for this audio must not be larger than 4608.
- Linear prediction subframes containing audio with a sample rate less than or equal to 48000 Hz MUST have a predictor order less than or equal to 12, i.e., the subframe type bits in the subframe header MUST NOT be 0b101100-0b111111.
- The Rice partition order MUST be less than or equal to 8.
- The channel ordering MUST be equal to one of the orderings defined for the channels bits in the frame header, i.e., the FLAC file MUST NOT need a WAVEFORMATEXTENSIBLE_CHANNEL_MASK tag to describe the channel ordering.

Value | Metadata block type |
---|---|

0 | Streaminfo |

1 | Padding |

2 | Application |

3 | Seektable |

4 | Vorbis comment |

5 | Cuesheet |

6 | Picture |

7 - 126 | reserved |

127 | forbidden, to avoid confusion with a frame sync code |

Data | Description |
---|---|
u(16) | The minimum block size (in samples) used in the stream, excluding the last block. |
u(16) | The maximum block size (in samples) used in the stream. |
u(24) | The minimum frame size (in bytes) used in the stream. A value of 0 signifies that the value is not known. |
u(24) | The maximum frame size (in bytes) used in the stream. A value of 0 signifies that the value is not known. |
u(20) | Sample rate in Hz. |
u(3) | (number of channels)-1. FLAC supports from 1 to 8 channels. |
u(5) | (bits per sample)-1. FLAC supports from 4 to 32 bits per sample. |
u(36) | Total number of interchannel samples in the stream. A value of zero here means the number of total samples is unknown. |
u(128) | MD5 checksum of the unencoded audio data. This allows the decoder to determine if an error exists in the audio data even when, despite the error, the bitstream itself is valid. A value of 0 signifies that the value is not known. |
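
Because the streaminfo fields are not byte-aligned, a parser has to work at the bit level. The following Python sketch shows one way to do that, assuming big-endian bit packing in the order listed in the table. The names `STREAMINFO_LAYOUT` and `parse_streaminfo` are hypothetical, and the example bytes are illustrative values with an all-zero (unknown) MD5 checksum.

```python
# Field names and widths in bits for the fields preceding the MD5 checksum.
STREAMINFO_LAYOUT = [
    ("min_block_size", 16),
    ("max_block_size", 16),
    ("min_frame_size", 24),
    ("max_frame_size", 24),
    ("sample_rate", 20),
    ("channels_minus_1", 3),
    ("bits_per_sample_minus_1", 5),
    ("total_samples", 36),
]


def parse_streaminfo(body):
    """Parse the 34-byte body of a streaminfo metadata block into a dict."""
    if len(body) != 34:
        raise ValueError("a streaminfo block body is 34 bytes long")
    packed = int.from_bytes(body[:18], "big")  # 144 bits before the checksum
    remaining = 8 * 18
    info = {}
    for name, width in STREAMINFO_LAYOUT:
        remaining -= width
        info[name] = (packed >> remaining) & ((1 << width) - 1)
    # Undo the "minus 1" storage of the channel count and bit depth.
    info["channels"] = info.pop("channels_minus_1") + 1
    info["bits_per_sample"] = info.pop("bits_per_sample_minus_1") + 1
    info["md5"] = body[18:]
    return info


# Illustrative body: block size 4096, frame size 15, 44100 Hz, stereo,
# 16 bits per sample, 1 sample in total, MD5 checksum left at zero.
example = bytes.fromhex("10001000" "00000f" "00000f" "0ac442f0" "00000001") + bytes(16)
info = parse_streaminfo(example)
assert (info["sample_rate"], info["channels"], info["bits_per_sample"]) == (44100, 2, 16)
```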

Data | Description |
---|---|
u(n) | n '0' bits (n MUST be a multiple of 8, i.e., a whole number of bytes, and MAY be zero). n is 8 times the size described in the metadata block header. |

Data | Description |
---|---|
u(32) | Registered application ID. |
u(n) | Application data (n MUST be a multiple of 8, i.e., a whole number of bytes). n is 8 times the size described in the metadata block header, minus the 32 bits already used for the application ID. |

Data | Description |
---|---|

Seekpoints | Zero or more seek points as defined in |

Data | Description |
---|---|
u(64) | Sample number of the first sample in the target frame, or 0xFFFFFFFFFFFFFFFF for a placeholder point. |
u(64) | Offset (in bytes) from the first byte of the first frame header to the first byte of the target frame's header. |
u(16) | Number of samples in the target frame. |

- For placeholder points, the second and third field values are undefined.
- Seek points within a table MUST be sorted in ascending order by sample number.
- Seek points within a table MUST be unique by sample number, with the exception of placeholder points.
- The previous two notes imply that there MAY be any number of placeholder points, but they MUST all occur at the end of the table.
- The sample offsets are those of an unmuxed FLAC stream. The offsets MUST NOT be updated on muxing to reflect the new offsets of FLAC frames in a container.

- Title: name of the current work.
- Artist: name of the artist generally responsible for the current work. For orchestral works, this is usually the composer; otherwise, it is often the performer.
- Album: name of the collection the current work belongs to.

Bit number | Channel description |
---|---|

0 | Front left |

1 | Front right |

2 | Front center |

3 | Low-frequency effects (LFE) |

4 | Back left |

5 | Back right |

6 | Front left of center |

7 | Front right of center |

8 | Back center |

9 | Side left |

10 | Side right |

11 | Top center |

12 | Top front left |

13 | Top front center |

14 | Top front right |

15 | Top rear left |

16 | Top rear center |

17 | Top rear right |

- If a file has a single channel, being a LFE channel, the Vorbis comment field is WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x8.
- If a file has four channels, being front left, front right, top front left, and top front right, the Vorbis comment field is WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x5003.
- If an input has four channels, being back center, top front center, front center, and top rear center in that order, they have to be reordered to front center, back center, top front center and top rear center. The Vorbis comment field added is WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x12104.
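
The examples above can be reproduced with a short Python sketch that sets one bit per channel and sorts the channels by bit number. The dictionary keys and function names are hypothetical labels chosen for this illustration; the bit numbers are those of the table above.

```python
CHANNEL_BITS = {
    "front left": 0, "front right": 1, "front center": 2,
    "low-frequency effects": 3, "back left": 4, "back right": 5,
    "front left of center": 6, "front right of center": 7,
    "back center": 8, "side left": 9, "side right": 10,
    "top center": 11, "top front left": 12, "top front center": 13,
    "top front right": 14, "top rear left": 15, "top rear center": 16,
    "top rear right": 17,
}


def channel_mask(channels):
    """Combine the bits of all given channels into a single mask value."""
    mask = 0
    for name in channels:
        mask |= 1 << CHANNEL_BITS[name]
    return mask


def storage_order(channels):
    """Return the channels sorted by ascending bit number."""
    return sorted(channels, key=CHANNEL_BITS.__getitem__)


def comment_field(channels):
    """Format the Vorbis comment field for the given channels."""
    return "WAVEFORMATEXTENSIBLE_CHANNEL_MASK=" + hex(channel_mask(channels))


inputs = ["back center", "top front center", "front center", "top rear center"]
print(storage_order(inputs))
print(comment_field(inputs))
```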

Data | Description |
---|---|
u(128*8) | Media catalog number, in ASCII printable characters 0x20-0x7E. |
u(64) | Number of lead-in samples. |
u(1) | 1 if the cuesheet corresponds to a CD-DA, else 0. |
u(7+258*8) | Reserved. All bits MUST be set to zero. |
u(8) | Number of tracks in this cuesheet. |
Cuesheet tracks | A number of cuesheet track structures. |

Data | Description |
---|---|
u(64) | Track offset of the first index point in samples, relative to the beginning of the FLAC audio stream. |
u(8) | Track number. |
u(12*8) | Track ISRC. |
u(1) | The track type: 0 for audio, 1 for non-audio. This corresponds to the CD-DA Q-channel control bit 3. |
u(1) | The pre-emphasis flag: 0 for no pre-emphasis, 1 for pre-emphasis. This corresponds to the CD-DA Q-channel control bit 5. |
u(6+13*8) | Reserved. All bits MUST be set to zero. |
u(8) | The number of track index points. |
Cuesheet track index points | For all tracks except the lead-out track, a number of cuesheet track index point structures. |

Data | Description |
---|---|
u(64) | Offset in samples, relative to the track offset, of the index point. |
u(8) | The track index point number. |
u(3*8) | Reserved. All bits MUST be set to zero. |

Data | Description |
---|---|
u(32) | The picture type according to the next table. |
u(32) | The length of the media type string in bytes. |
u(n*8) | The media type string, or the text string --> to signify that the data part is a URI of the picture instead of the picture data itself. This field must be in printable ASCII characters 0x20-0x7E. |
u(32) | The length of the description string in bytes. |
u(n*8) | The description of the picture, in UTF-8. |
u(32) | The width of the picture in pixels. |
u(32) | The height of the picture in pixels. |
u(32) | The color depth of the picture in bits per pixel. |
u(32) | For indexed-color pictures (e.g., GIF), the number of colors used, or 0 for non-indexed pictures. |
u(32) | The length of the picture data in bytes. |
u(n*8) | The binary picture data. |

Value | Picture type |
---|---|

0 | Other |

1 | PNG file icon of 32x32 pixels |

2 | General file icon |

3 | Front cover |

4 | Back cover |

5 | Liner notes page |

6 | Media label (e.g., CD, Vinyl or Cassette label) |

7 | Lead artist, lead performer, or soloist |

8 | Artist or performer |

9 | Conductor |

10 | Band or orchestra |

11 | Composer |

12 | Lyricist or text writer |

13 | Recording location |

14 | During recording |

15 | During performance |

16 | Movie or video screen capture |

17 | A bright colored fish |

18 | Illustration |

19 | Band or artist logotype |

20 | Publisher or studio logotype |

- The URI can be either in absolute or relative form. If a URI is in relative form, it is related to the URI of the FLAC content processed.
- Applications MUST obtain explicit user approval to retrieve images via remote protocols and to retrieve local images not located in the same directory as the FLAC file being processed.
- Applications supporting linked images MUST handle unavailability of URIs gracefully. They MAY report unavailability to the user.
- Applications MAY reject processing URIs for any reason, in particular for security or privacy reasons.

Value | Block size |
---|---|

0b0000 | reserved |

0b0001 | 192 |

0b0010 - 0b0101 | 144 * (2^v), i.e., 576, 1152, 2304, or 4608 |

0b0110 | uncommon block size minus 1 stored as an 8-bit number |

0b0111 | uncommon block size minus 1 stored as a 16-bit number |

0b1000 - 0b1111 | 2^v, i.e., 256, 512, 1024, 2048, 4096, 8192, 16384, or 32768 |
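
The mapping in this table can be expressed as a small lookup function. The Python sketch below is illustrative only: `decode_block_size` and its `uncommon` parameter (the 8-bit or 16-bit number stored for values 0b0110 and 0b0111) are hypothetical names.

```python
def decode_block_size(bits, uncommon=None):
    """Map the 4 block size bits (and optional stored number) to a block size."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("block size bits must fit in 4 bits")
    if bits == 0b0000:
        raise ValueError("block size bits 0b0000 are reserved")
    if bits == 0b0001:
        return 192
    if bits <= 0b0101:
        return 144 * (1 << bits)  # 576, 1152, 2304, or 4608
    if bits <= 0b0111:
        if uncommon is None:
            raise ValueError("an uncommon block size needs the stored number")
        return uncommon + 1       # stored as block size minus 1
    return 1 << bits              # 256 through 32768


# Block size bits 0b0110 followed by the stored byte 0x0f give a block size of 16.
assert decode_block_size(0b0110, 0x0F) == 16
```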

Value | Sample rate |
---|---|

0b0000 | sample rate only stored in the streaminfo metadata block |

0b0001 | 88.2 kHz |

0b0010 | 176.4 kHz |

0b0011 | 192 kHz |

0b0100 | 8 kHz |

0b0101 | 16 kHz |

0b0110 | 22.05 kHz |

0b0111 | 24 kHz |

0b1000 | 32 kHz |

0b1001 | 44.1 kHz |

0b1010 | 48 kHz |

0b1011 | 96 kHz |

0b1100 | uncommon sample rate in kHz stored as an 8-bit number |

0b1101 | uncommon sample rate in Hz stored as a 16-bit number |

0b1110 | uncommon sample rate in Hz divided by 10, stored as a 16-bit number |

0b1111 | forbidden |

Value | Channels |
---|---|

0b0000 | 1 channel: mono |

0b0001 | 2 channels: left, right |

0b0010 | 3 channels: left, right, center |

0b0011 | 4 channels: front left, front right, back left, back right |

0b0100 | 5 channels: front left, front right, front center, back/surround left, back/surround right |

0b0101 | 6 channels: front left, front right, front center, LFE, back/surround left, back/surround right |

0b0110 | 7 channels: front left, front right, front center, LFE, back center, side left, side right |

0b0111 | 8 channels: front left, front right, front center, LFE, back left, back right, side left, side right |

0b1000 | 2 channels, left, right, stored as left/side stereo |

0b1001 | 2 channels, left, right, stored as right/side stereo |

0b1010 | 2 channels, left, right, stored as mid/side stereo |

0b1011 - 0b1111 | reserved |

Value | Bit depth |
---|---|

0b000 | bit depth only stored in the streaminfo metadata block |

0b001 | 8 bits per sample |

0b010 | 12 bits per sample |

0b011 | reserved |

0b100 | 16 bits per sample |

0b101 | 20 bits per sample |

0b110 | 24 bits per sample |

0b111 | 32 bits per sample |

Number range (hexadecimal) | Octet sequence (binary) |
---|---|
0000 0000 0000 - 0000 0000 007F | 0xxxxxxx |
0000 0000 0080 - 0000 0000 07FF | 110xxxxx 10xxxxxx |
0000 0000 0800 - 0000 0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx |
0000 0001 0000 - 0000 001F FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
0000 0020 0000 - 0000 03FF FFFF | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
0000 0400 0000 - 0000 7FFF FFFF | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
0000 8000 0000 - 000F FFFF FFFF | 11111110 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
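
A decoder for this variable-length number can count the leading 1 bits of the first octet to find the sequence length and then collect 6 bits from each following octet. The Python sketch below illustrates this; `decode_coded_number` is a hypothetical name, and the sketch does not reject overlong encodings, which a stricter parser might want to do.

```python
def decode_coded_number(data):
    """Return (value, octets consumed) for the coded number at the start of data."""
    if not data:
        raise ValueError("no data to decode")
    first = data[0]
    # Count the leading 1 bits of the first octet.
    length, mask = 0, 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length == 0:
        return first, 1  # single octet 0xxxxxxx
    if length == 1 or length == 8:
        raise ValueError("invalid first octet")  # 10xxxxxx or 11111111
    if len(data) < length:
        raise ValueError("truncated coded number")
    value = first & (mask - 1)  # payload bits after the terminating 0 bit
    for octet in data[1:length]:
        if (octet & 0xC0) != 0x80:
            raise ValueError("invalid continuation octet")
        value = (value << 6) | (octet & 0x3F)
    return value, length


# The largest representable number uses 7 octets and carries 36 payload bits.
assert decode_coded_number(bytes([0xFE] + [0xBF] * 6)) == (0xFFFFFFFFF, 7)
```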

Value | Subframe type |
---|---|

0b000000 | Constant subframe |

0b000001 | Verbatim subframe |

0b000010 - 0b000111 | reserved |

0b001000 - 0b001100 | Subframe with a fixed predictor of order v-8, i.e., 0, 1, 2, 3 or 4 |

0b001101 - 0b011111 | reserved |

0b100000 - 0b111111 | Subframe with a linear predictor of order v-31, i.e., 1 through 32 (inclusive) |

Order | Prediction | Derivation |
---|---|---|

0 | 0 | N/A |

1 | a(n-1) | N/A |

2 | 2 * a(n-1) - a(n-2) | a(n-1) + a'(n-1) |

3 | 3 * a(n-1) - 3 * a(n-2) + a(n-3) | a(n-1) + a'(n-1) + a''(n-1) |

4 | 4 * a(n-1) - 6 * a(n-2) + 4 * a(n-3) - a(n-4) | a(n-1) + a'(n-1) + a''(n-1) + a'''(n-1) |

- n is the number of the sample being predicted.
- a(n) is the sample being predicted.
- a(n-1) is the sample before the one being predicted.
- a'(n-1) is the difference between the previous sample and the sample before that, i.e., a(n-1) - a(n-2). This is the closest available first-order discrete derivative.
- a''(n-1) is a'(n-1) - a'(n-2) or the closest available second-order discrete derivative.
- a'''(n-1) is a''(n-1) - a''(n-2) or the closest available third-order discrete derivative.
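
The two columns of the table describe the same prediction in two ways. As a sanity check, the following Python sketch computes both forms and compares them; the names `FIXED_COEFFICIENTS`, `fixed_predict`, and `derivative_predict` are hypothetical, and `past[-1]` stands for a(n-1).

```python
# Coefficients from the "Prediction" column; the first applies to a(n-1).
FIXED_COEFFICIENTS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_predict(order, past):
    """Prediction as a weighted sum of the last `order` samples."""
    return sum(c * past[-1 - i] for i, c in enumerate(FIXED_COEFFICIENTS[order]))


def derivative_predict(order, past):
    """Prediction as a(n-1) plus its discrete derivatives (needs len(past) >= order)."""
    total = 0
    diffs = list(past)
    for _ in range(order):
        total += diffs[-1]  # a(n-1), then a'(n-1), then a''(n-1), ...
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return total


past = [3, -1, 4, 1, -5, 9]
for order in range(5):
    assert fixed_predict(order, past) == derivative_predict(order, past)
```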

Data | Description |
---|---|
s(n) | Unencoded warm-up samples (n = subframe's bits per sample * predictor order). |
Coded residual | The coded residual. |

Data | Description |
---|---|
s(n) | Unencoded warm-up samples (n = subframe's bits per sample * lpc order). |
u(4) | (Predictor coefficient precision in bits)-1 (NOTE: 0b1111 is forbidden). |
s(5) | Prediction right shift needed in bits. |
s(n) | Predictor coefficients (n = predictor coefficient precision * lpc order). |
Coded residual | The coded residual. |

Value | Description |
---|---|

0b00 | partitioned Rice code with 4-bit parameters |

0b01 | partitioned Rice code with 5-bit parameters |

0b10 - 0b11 | reserved |

Data | Description |
---|---|

5 bytes | Bytes 0x7F 0x46 0x4C 0x41 0x43 |

2 bytes | Version number of the FLAC-in-Ogg mapping. These bytes are 0x01 0x00, meaning version 1.0 of the mapping. |

2 bytes | Number of header packets (excluding the first header packet) as an unsigned number coded big-endian. |

4 bytes | The fLaC signature |

4 bytes | A metadata block header for the streaminfo block |

34 bytes | A streaminfo metadata block |

Application ID | ASCII rendition (if available) | Description | Specification | Change controller |
---|---|---|---|---|

0x41544348 | ATCH | FlacFile | IETF | |

0x42534F4C | BSOL | beSolo | IETF | |

0x42554753 | BUGS | Bugs Player | IETF | |

0x43756573 | Cues | GoldWave cue points | IETF | |

0x46696361 | Fica | CUE Splitter | IETF | |

0x46746F6C | Ftol | flac-tools | IETF | |

0x4D4F5442 | MOTB | MOTB MetaCzar | IETF | |

0x4D505345 | MPSE | MP3 Stream Editor | IETF | |

0x4D754D4C | MuML | MusicML: Music Metadata Language | IETF | |

0x52494646 | RIFF | Sound Devices RIFF chunk storage | IETF | |

0x5346464C | SFFL | Sound Font FLAC | IETF | |

0x534F4E59 | SONY | Sony Creative Software | IETF | |

0x5351455A | SQEZ | flacsqueeze | IETF | |

0x54745776 | TtWv | TwistedWave | IETF | |

0x55495453 | UITS | UITS Embedding tools | IETF | |

0x61696666 | aiff | FLAC AIFF chunk storage | IETF | |

0x696D6167 | imag | flac-image | IETF | |

0x7065656D | peem | Parseable Embedded Extensible Metadata | IETF | |

0x71667374 | qfst | QFLAC Studio | IETF | |

0x72696666 | riff | FLAC RIFF chunk storage | IETF | |

0x74756E65 | tune | TagTuner | IETF | |

0x773634C0 | w64 | FLAC Wave64 chunk storage | IETF | |

0x78626174 | xbat | XBAT | IETF | |

0x786D6364 | xmcd | xmcd | IETF |

- A. J. Robinson for his work on Shorten; his paper is a good starting point on some of the basic methods used by FLAC. FLAC trivially extends and improves the fixed predictors, LPC coefficient quantization, and Rice coding used in Shorten.
- S. W. Golomb and Robert F. Rice; their universal codes are used by FLAC's entropy coder.
- N. Levinson and J. Durbin; the FLAC reference encoder uses an algorithm developed and refined by them for determining the LPC coefficients from the autocorrelation coefficients.
- And of course, Claude Shannon.

Order | Calculation of residual | Sample values summed | Extra bits |
---|---|---|---|

0 | a(n) | 1 | 0 |

1 | a(n) - a(n-1) | 2 | 1 |

2 | a(n) - 2 * a(n-1) + a(n-2) | 4 | 2 |

3 | a(n) - 3 * a(n-1) + 3 * a(n-2) - a(n-3) | 8 | 3 |

4 | a(n) - 4 * a(n-1) + 6 * a(n-2) - 4 * a(n-3) + a(n-4) | 16 | 4 |

- n is the number of the sample being predicted.
- a(n) is the sample being predicted.
- a(n-1) is the sample before the one being predicted, a(n-2) is the sample before that, etc.

- DVD-Audio can store 20-bit PCM audio.
- DAT and DV can store 12-bit PCM audio.
- NICAM-728 samples at 14 bits, which is companded to 10 bits.
- 8-bit µ-law can be losslessly converted to 14-bit (linear) PCM.
- 8-bit A-law can be losslessly converted to 13-bit (linear) PCM.

Start | Length | Contents | Description |
---|---|---|---|

0x00+0 | 4 bytes | 0x664C6143 | fLaC |

0x04+0 | 1 bit | 0b1 | Last metadata block |

0x04+1 | 7 bits | 0b0000000 | Streaminfo metadata block |

0x05+0 | 3 bytes | 0x000022 | Length 34 byte |

Start | Length | Contents | Description |
---|---|---|---|

0x08+0 | 2 bytes | 0x1000 | Min. block size 4096 |

0x0a+0 | 2 bytes | 0x1000 | Max. block size 4096 |

0x0c+0 | 3 bytes | 0x00000f | Min. frame size 15 byte |

0x0f+0 | 3 bytes | 0x00000f | Max. frame size 15 byte |

0x12+0 | 20 bits | 0x0ac4, 0b0100 | Sample rate 44100 hertz |

0x14+4 | 3 bits | 0b001 | 2 channels |

0x14+7 | 5 bits | 0b01111 | Sample bit depth 16 |

0x15+4 | 36 bits | 0b0000, 0x00000001 | Total no. of samples 1 |

0x1a | 16 bytes | (...) | MD5 checksum |

Start | Length | Contents | Description |
---|---|---|---|

0x2a+0 | 15 bits | 0xff, 0b1111100 | frame sync |

0x2b+7 | 1 bit | 0b0 | blocking strategy |

0x2c+0 | 4 bits | 0b0110 | 8-bit block size further down |

0x2c+4 | 4 bits | 0b1001 | sample rate 44.1 kHz |

0x2d+0 | 4 bits | 0b0001 | stereo, no decorrelation |

0x2d+4 | 3 bits | 0b100 | bit depth 16 bit |

0x2d+7 | 1 bit | 0b0 | mandatory 0 bit |

0x2e+0 | 1 byte | 0x00 | frame number 0 |

0x2f+0 | 1 byte | 0x00 | block size 1 |

0x30+0 | 1 byte | 0xbf | frame header CRC |

Start | Length | Contents | Description |
---|---|---|---|

0x31+0 | 1 bit | 0b0 | mandatory 0 bit |

0x31+1 | 6 bits | 0b000001 | verbatim subframe |

0x31+7 | 1 bit | 0b1 | wasted bits used |

0x32+0 | 2 bits | 0b01 | 2 wasted bits used |

0x32+2 | 14 bits | 0b011000, 0xfd | 14-bit unencoded sample |

Start | Length | Contents | Description |
---|---|---|---|

0x34+0 | 1 bit | 0b0 | mandatory 0 bit |

0x34+1 | 6 bits | 0b000001 | verbatim subframe |

0x34+7 | 1 bit | 0b1 | wasted bits used |

0x35+0 | 4 bits | 0b0001 | 4 wasted bits used |

0x35+4 | 12 bits | 0b0010, 0x8b | 12-bit unencoded sample |

Start | Length | Contents | Description |
---|---|---|---|

0x04+0 | 1 bit | 0b0 | Not the last metadata block |

0x08+0 | 2 bytes | 0x0010 | Min. block size 16 |

0x0a+0 | 2 bytes | 0x0010 | Max. block size 16 |

0x0c+0 | 3 bytes | 0x000017 | Min. frame size 23 byte |

0x0f+0 | 3 bytes | 0x000044 | Max. frame size 68 byte |

0x15+4 | 36 bits | 0b0000, 0x00000013 | Total no. of samples 19 |

0x1a | 16 bytes | (...) | MD5 checksum |

Start | Length | Contents | Description |
---|---|---|---|

0x2a+0 | 1 bit | 0b0 | Not the last metadata block |

0x2a+1 | 7 bits | 0b0000011 | Seektable metadata block |

0x2b+0 | 3 bytes | 0x000012 | Length 18 byte |

0x2e+0 | 8 bytes | 0x0000000000000000 | Seekpoint to sample 0 |

0x36+0 | 8 bytes | 0x0000000000000000 | Seekpoint to offset 0 |

0x3e+0 | 2 bytes | 0x0010 | Seekpoint to block size 16 |

Start | Length | Contents | Description |
---|---|---|---|

0x40+0 | 1 bit | 0b0 | Not the last metadata block |

0x40+1 | 7 bits | 0b0000100 | Vorbis comment metadata block |

0x41+0 | 3 bytes | 0x00003a | Length 58 byte |

0x44+0 | 4 bytes | 0x20000000 | Vendor string length 32 byte |

0x48+0 | 32 bytes | (...) | Vendor string |

0x68+0 | 4 bytes | 0x01000000 | Number of fields 1 |

0x6c+0 | 4 bytes | 0x0e000000 | Field length 14 byte |

0x70+0 | 14 bytes | (...) | Field contents |

Start | Length | Contents | Description |
---|---|---|---|

0x7e+0 | 1 bit | 0b1 | Last metadata block |

0x7e+1 | 7 bits | 0b0000001 | Padding metadata block |

0x7f+0 | 3 bytes | 0x000006 | Length 6 byte |

0x82+0 | 6 bytes | 0x000000000000 | Padding bytes |

Start | Length | Contents | Description |
---|---|---|---|

0x88+0 | 15 bits | 0xff, 0b1111100 | frame sync |

0x89+7 | 1 bit | 0b0 | blocking strategy |

0x8a+0 | 4 bits | 0b0110 | 8-bit block size further down |

0x8a+4 | 4 bits | 0b1001 | sample rate 44.1 kHz |

0x8b+0 | 4 bits | 0b1001 | side-right stereo |

0x8b+4 | 3 bits | 0b100 | bit depth 16 bit |

0x8b+7 | 1 bit | 0b0 | mandatory 0 bit |

0x8c+0 | 1 byte | 0x00 | frame number 0 |

0x8d+0 | 1 byte | 0x0f | block size 16 |

0x8e+0 | 1 byte | 0x99 | frame header CRC |

Start | Length | Contents | Description |
---|---|---|---|

0x8f+0 | 1 bit | 0b0 | mandatory 0 bit |

0x8f+1 | 6 bits | 0b001001 | fixed subframe, 1st order |

0x8f+7 | 1 bit | 0b0 | no wasted bits used |

0x90+0 | 17 bits | 0x0867, 0b0 | unencoded warm-up sample |

Start | Length | Contents | Description |
---|---|---|---|

0x92+1 | 2 bits | 0b00 | Rice code with 4-bit parameter |

0x92+3 | 4 bits | 0b0000 | Partition order 0 |

0x92+7 | 4 bits | 0b1011 | Rice parameter 11 |

0x93+3 | 4 bits | 0b0001 | Quotient 3 |

0x93+7 | 11 bits | 0b00011110100 | Remainder 244 |

0x95+2 | 2 bits | 0b01 | Quotient 1 |

0x95+4 | 11 bits | 0b01000100001 | Remainder 545 |

0x96+7 | 2 bits | 0b01 | Quotient 1 |

0x97+1 | 11 bits | 0b00110011000 | Remainder 408 |

0x98+4 | 1 bit | 0b1 | Quotient 0 |

0x98+5 | 11 bits | 0b11101011101 | Remainder 1885 |

0x9a+0 | 1 bit | 0b1 | Quotient 0 |

0x9a+1 | 11 bits | 0b11101110000 | Remainder 1904 |

0x9b+4 | 1 bit | 0b1 | Quotient 0 |

0x9b+5 | 11 bits | 0b10101101111 | Remainder 1391 |

0x9d+0 | 1 bit | 0b1 | Quotient 0 |

0x9d+1 | 11 bits | 0b11000000000 | Remainder 1536 |

0x9e+4 | 1 bit | 0b1 | Quotient 0 |

0x9e+5 | 11 bits | 0b10000010111 | Remainder 1047 |

0xa0+0 | 1 bit | 0b1 | Quotient 0 |

0xa0+1 | 11 bits | 0b10010101110 | Remainder 1198 |

0xa1+4 | 1 bit | 0b1 | Quotient 0 |

0xa1+5 | 11 bits | 0b01100100001 | Remainder 801 |

0xa3+0 | 13 bits | 0b0000000000001 | Quotient 12 |

0xa4+5 | 11 bits | 0b11011100111 | Remainder 1767 |

0xa6+0 | 1 bit | 0b1 | Quotient 0 |

0xa6+1 | 11 bits | 0b01001110111 | Remainder 631 |

0xa7+4 | 1 bit | 0b1 | Quotient 0 |

0xa7+5 | 11 bits | 0b01000100100 | Remainder 548 |

0xa9+0 | 1 bit | 0b1 | Quotient 0 |

0xa9+1 | 11 bits | 0b01000010101 | Remainder 533 |

0xaa+4 | 1 bit | 0b1 | Quotient 0 |

0xaa+5 | 11 bits | 0b00100001100 | Remainder 268 |

Quotient | Remainder | Zig-zag encoded | Residual sample value |
---|---|---|---|

3 | 244 | 6388 | 3194 |

1 | 545 | 2593 | -1297 |

1 | 408 | 2456 | 1228 |

0 | 1885 | 1885 | -943 |

0 | 1904 | 1904 | 952 |

0 | 1391 | 1391 | -696 |

0 | 1536 | 1536 | 768 |

0 | 1047 | 1047 | -524 |

0 | 1198 | 1198 | 599 |

0 | 801 | 801 | -401 |

12 | 1767 | 26343 | -13172 |

0 | 631 | 631 | -316 |

0 | 548 | 548 | 274 |

0 | 533 | 533 | -267 |

0 | 268 | 268 | 134 |
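
The arithmetic behind the table above can be sketched as follows, assuming the Rice parameter of 11 given in the residual header. The quotient and remainder are joined into the zig-zag encoded value, after which even values map to non-negative residuals and odd values to negative ones. The function name `unfold` is hypothetical.

```python
def unfold(quotient: int, remainder: int, rice_parameter: int) -> int:
    """Turn a Rice quotient/remainder pair into a signed residual."""
    folded = (quotient << rice_parameter) | remainder   # zig-zag encoded value
    if folded % 2 == 0:
        return folded // 2          # even: non-negative residual
    return -((folded + 1) // 2)     # odd: negative residual


print(unfold(3, 244, 11))    # 3194
print(unfold(1, 545, 11))    # -1297
print(unfold(12, 1767, 11))  # -13172
```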

Residual | Sample value |
---|---|

(warm-up) | 4302 |

3194 | 7496 |

-1297 | 6199 |

1228 | 7427 |

-943 | 6484 |

952 | 7436 |

-696 | 6740 |

768 | 7508 |

-524 | 6984 |

599 | 7583 |

-401 | 7182 |

-13172 | -5990 |

-316 | -6306 |

274 | -6032 |

-267 | -6299 |

134 | -6165 |
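
The sample values above are consistent with a first-order fixed predictor, where the prediction for each sample is simply the previous sample. The subframe header that selects the predictor lies outside this table, so the predictor order is an assumption in the sketch below; the single warm-up sample and the numbers themselves support it.

```python
warm_up = 4302
residuals = [3194, -1297, 1228, -943, 952, -696, 768, -524,
             599, -401, -13172, -316, 274, -267, 134]

samples = [warm_up]
for residual in residuals:
    prediction = samples[-1]            # first-order fixed predictor (assumed)
    samples.append(prediction + residual)

print(samples[:4])   # [4302, 7496, 6199, 7427]
print(samples[-1])   # -6165
```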

Subframe 1 | Subframe 2 | Left | Right |
---|---|---|---|

4302 | 6070 | 10372 | 6070 |

7496 | 10545 | 18041 | 10545 |

6199 | 8743 | 14942 | 8743 |

7427 | 10449 | 17876 | 10449 |

6484 | 9143 | 15627 | 9143 |

7436 | 10463 | 17899 | 10463 |

6740 | 9502 | 16242 | 9502 |

7508 | 10569 | 18077 | 10569 |

6984 | 9840 | 16824 | 9840 |

7583 | 10680 | 18263 | 10680 |

7182 | 10113 | 17295 | 10113 |

-5990 | -8428 | -14418 | -8428 |

-6306 | -8895 | -15201 | -8895 |

-6032 | -8476 | -14508 | -8476 |

-6299 | -8896 | -15195 | -8896 |

-6165 | -8653 | -14818 | -8653 |
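
In the table above, the right channel equals the second subframe and the left channel equals the sum of both subframes, which is what undoing side/right decorrelation looks like. A minimal sketch using the first three rows follows; the variable names are illustrative, and the channel assignment is inferred from the table rather than read from the frame header.

```python
side = [4302, 7496, 6199]      # subframe 1 (left minus right)
right = [6070, 10545, 8743]    # subframe 2

# Undo the decorrelation: left = side + right, right is stored directly.
left = [s + r for s, r in zip(side, right)]
print(left)  # [10372, 18041, 14942]
```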

Start | Length | Contents | Description |
---|---|---|---|

0xcc+0 | 15 bits | 0xff, 0b1111100 | frame sync |

0xcd+7 | 1 bit | 0b0 | blocking strategy: fixed block size |

0xce+0 | 4 bits | 0b0110 | 8-bit block size further down |

0xce+4 | 4 bits | 0b1001 | sample rate 44.1 kHz |

0xcf+0 | 4 bits | 0b0001 | stereo, no decorrelation |

0xcf+4 | 3 bits | 0b100 | bit depth 16 bit |

0xcf+7 | 1 bit | 0b0 | mandatory 0 bit |

0xd0+0 | 1 byte | 0x01 | frame number 1 |

0xd1+0 | 1 byte | 0x02 | block size 3 (stored as block size minus 1) |

0xd2+0 | 1 byte | 0xa4 | frame header CRC |
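
The frame header CRC can be checked with a short sketch. FLAC's frame header CRC is a CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07), initialized to zero and computed over all header bytes that precede it. The six input bytes below are assembled from the fields in the table above (for example, 0b0110 and 0b1001 form 0x69); the function name `crc8` is hypothetical.

```python
def crc8(data: bytes) -> int:
    """Bitwise CRC-8, polynomial 0x07, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Sync + blocking strategy, block size/sample rate, channels/bit depth,
# frame number, block size minus 1.
header = bytes([0xFF, 0xF8, 0x69, 0x18, 0x01, 0x02])
print(hex(crc8(header)))  # 0xa4
```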

Start | Length | Contents | Description |
---|---|---|---|

0xd3+0 | 1 bit | 0b0 | mandatory 0 bit |

0xd3+1 | 6 bits | 0b000001 | verbatim subframe |

0xd3+7 | 1 bit | 0b0 | no wasted bits used |

0xd4+0 | 16 bits | 0xc382 | 16-bit unencoded sample |

0xd6+0 | 16 bits | 0xc40b | 16-bit unencoded sample |

0xd8+0 | 16 bits | 0xc14a | 16-bit unencoded sample |
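
Samples in a verbatim subframe are stored as big-endian two's complement numbers. The following sketch decodes the three 16-bit values from the table above; it only covers the byte-aligned case shown here.

```python
raw = bytes.fromhex("c382c40bc14a")   # the three samples from the table

# Each sample is a 16-bit big-endian signed integer.
samples = [int.from_bytes(raw[i:i + 2], "big", signed=True)
           for i in range(0, len(raw), 2)]
print(samples)  # [-15486, -15349, -16054]
```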

Start | Length | Contents | Description |
---|---|---|---|

0xda+0 | 1 bit | 0b0 | mandatory 0 bit |

0xda+1 | 6 bits | 0b000001 | verbatim subframe |

0xda+7 | 1 bit | 0b1 | wasted bits used |

0xdb+0 | 1 bit | 0b1 | 1 wasted bit used |

0xdb+1 | 15 bits | 0b110111001001000 | 15-bit unencoded sample |

0xdd+0 | 15 bits | 0b110111010000001 | 15-bit unencoded sample |

0xde+7 | 15 bits | 0b110110110011111 | 15-bit unencoded sample |
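
With one wasted bit, each sample is stored with 15 instead of 16 bits and has to be shifted left by one position after decoding. A minimal sketch, working on the bit strings from the table above, follows; the helper name `sign_extend` is hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret an unsigned integer as a two's complement number of `bits` bits."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


wasted_bits = 1
codes = ["110111001001000", "110111010000001", "110110110011111"]

# Decode each 15-bit sample, then restore the wasted (always zero) low bit.
samples = [sign_extend(int(code, 2), 15) << wasted_bits for code in codes]
print(samples)  # [-9072, -8958, -9410]
```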

Start | Length | Contents | Description |
---|---|---|---|

0x0c+0 | 3 bytes | 0x00001f | Min. frame size 31 bytes |

0x0f+0 | 3 bytes | 0x00001f | Max. frame size 31 bytes |

0x12+0 | 20 bits | 0x07d0, 0b0000 | Sample rate 32000 hertz |

0x14+4 | 3 bits | 0b000 | 1 channel |

0x14+7 | 5 bits | 0b00111 | Sample bit depth 8 bit |

0x15+4 | 36 bits | 0b0000, 0x00000018 | Total no. of samples 24 |

0x1a+0 | 16 bytes | (...) | MD5 checksum |
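
The sample rate, channel count, bit depth, and total number of samples are packed into 64 bits without byte alignment. The sketch below unpacks them with shifts and masks. The eight input bytes (offsets 0x12 through 0x19) are assembled from the fields in the table above and are therefore an assumption about the raw file contents.

```python
packed = int.from_bytes(bytes.fromhex("07d0007000000018"), "big")

sample_rate = packed >> 44                     # top 20 bits
channels = ((packed >> 41) & 0x7) + 1          # 3 bits, stored minus 1
bit_depth = ((packed >> 36) & 0x1F) + 1        # 5 bits, stored minus 1
total_samples = packed & ((1 << 36) - 1)       # low 36 bits

print(sample_rate, channels, bit_depth, total_samples)  # 32000 1 8 24
```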

Start | Length | Contents | Description |
---|---|---|---|

0x2a+0 | 15 bits | 0xff, 0b1111100 | Frame sync |

0x2b+7 | 1 bit | 0b0 | Blocking strategy: fixed block size |

0x2c+0 | 4 bits | 0b0110 | 8-bit block size further down |

0x2c+4 | 4 bits | 0b1000 | Sample rate 32 kHz |

0x2d+0 | 4 bits | 0b0000 | Mono audio (1 channel) |

0x2d+4 | 3 bits | 0b001 | Bit depth 8 bit |

0x2d+7 | 1 bit | 0b0 | Mandatory 0 bit |

0x2e+0 | 1 byte | 0x00 | Frame number 0 |

0x2f+0 | 1 byte | 0x17 | Block size 24 (stored as block size minus 1) |

0x30+0 | 1 byte | 0xe9 | Frame header CRC |
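
The fields of this frame header can be pulled apart with shifts and masks, as sketched below. The seven bytes are assembled from the table above. The sketch handles only the case shown here: a frame number that fits in a single byte and a block size coded as an 8-bit value; it is not a general header parser.

```python
header = bytes.fromhex("fff868020017e9")

sync = (header[0] << 7) | (header[1] >> 1)    # 15-bit frame sync
blocking_strategy = header[1] & 1
block_size_code = header[2] >> 4
sample_rate_code = header[2] & 0xF
channel_code = header[3] >> 4
bit_depth_code = (header[3] >> 1) & 0x7
frame_number = header[4]                      # single-byte case only
block_size = header[5] + 1                    # code 0b0110: 8-bit value, minus 1
header_crc = header[6]

print(bin(sync), block_size, hex(header_crc))
```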

Start | Length | Contents | Description |
---|---|---|---|

0x31+0 | 1 bit | 0b0 | Mandatory 0 bit |

0x31+1 | 6 bits | 0b100010 | Linear prediction subframe, 3rd order |

0x31+7 | 1 bit | 0b0 | No wasted bits used |

0x32+0 | 8 bits | 0x00 | Unencoded warm-up sample 0 |

0x33+0 | 8 bits | 0x4f | Unencoded warm-up sample 79 |

0x34+0 | 8 bits | 0x6f | Unencoded warm-up sample 111 |

0x35+0 | 4 bits | 0b0011 | Coefficient precision 4 bit |

0x35+4 | 5 bits | 0b00010 | Prediction right shift 2 |

0x36+1 | 4 bits | 0b0111 | Predictor coefficient 7 |

0x36+5 | 4 bits | 0b1010 | Predictor coefficient -6 |

0x37+1 | 4 bits | 0b0010 | Predictor coefficient 2 |
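
The predictor coefficients are stored as two's complement numbers whose width is the coefficient precision, which itself is stored minus 1. A small sketch of that decoding, using the bit patterns from the table above, follows; the helper name `to_signed` is hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret `value` as a two's complement number that is `bits` wide."""
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


precision = 0b0011 + 1     # stored as precision minus 1, so 4 bits
shift = 0b00010            # prediction right shift
coefficients = [to_signed(int(code, 2), precision)
                for code in ("0111", "1010", "0010")]
print(precision, shift, coefficients)  # 4 2 [7, -6, 2]
```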

Start | Length | Contents | Description |
---|---|---|---|

0x37+5 | 2 bits | 0b00 | Rice-coded residual, 4-bit parameter |

0x37+7 | 4 bits | 0b0010 | Partition order 2 |

0x38+3 | 4 bits | 0b0011 | Rice parameter 3 |

0x38+7 | 1 bit | 0b1 | Quotient 0 |

0x39+0 | 3 bits | 0b110 | Remainder 6 |

0x39+3 | 1 bit | 0b1 | Quotient 0 |

0x39+4 | 3 bits | 0b001 | Remainder 1 |

0x39+7 | 4 bits | 0b0001 | Quotient 3 |

0x3a+3 | 3 bits | 0b001 | Remainder 1 |

0x3a+6 | 4 bits | 0b1111 | No Rice parameter, escape code |

0x3b+2 | 5 bits | 0b00101 | Partition encoded with 5 bits |

0x3b+7 | 5 bits | 0b10110 | Residual -10 |

0x3c+4 | 5 bits | 0b11010 | Residual -6 |

0x3d+1 | 5 bits | 0b00010 | Residual 2 |

0x3d+6 | 5 bits | 0b01000 | Residual 8 |

0x3e+3 | 5 bits | 0b01000 | Residual 8 |

0x3f+0 | 5 bits | 0b00110 | Residual 6 |

0x3f+5 | 4 bits | 0b0010 | Rice parameter 2 |

0x40+1 | 22 bits | (...) | Residual partition 3 |

0x42+7 | 4 bits | 0b0001 | Rice parameter 1 |

0x43+3 | 23 bits | (...) | Residual partition 4 |
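
The partition layout and the bit counts in the table above can be cross-checked with a short sketch. With block size 24, partition order 2, and predictor order 3, the first partition holds 3 residuals and the others 6 each. The residual values used below are taken from the sample reconstruction table of this example; the function names are hypothetical.

```python
def partition_sizes(block_size: int, partition_order: int, predictor_order: int):
    """Number of residuals per partition; warm-up samples shorten the first one."""
    per_partition = block_size >> partition_order
    count = 1 << partition_order
    return [per_partition - predictor_order] + [per_partition] * (count - 1)


def rice_bits(residuals, rice_parameter: int) -> int:
    """Total number of bits needed to Rice-code the given residuals."""
    total = 0
    for residual in residuals:
        folded = 2 * residual if residual >= 0 else -2 * residual - 1
        total += (folded >> rice_parameter) + 1 + rice_parameter
    return total


print(partition_sizes(24, 2, 3))                 # [3, 6, 6, 6]
print(rice_bits([3, -1, -13], 3))                # 15
print(rice_bits([0, -3, -5, -4, -1, 1], 2))      # 22
print(rice_bits([1, 4, 2, 2, 2, 0], 1))          # 23
```

The second partition is escaped, so its 6 residuals take 5 bits each, or 30 bits in total.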

Residual | Prediction before shift | Prediction after shift | Sample value |
---|---|---|---|

(warm-up) | N/A | N/A | 0 |

(warm-up) | N/A | N/A | 79 |

(warm-up) | N/A | N/A | 111 |

3 | 303 | 75 | 78 |

-1 | 38 | 9 | 8 |

-13 | -190 | -48 | -61 |

-10 | -319 | -80 | -90 |

-6 | -248 | -62 | -68 |

2 | -58 | -15 | -13 |

8 | 137 | 34 | 42 |

8 | 236 | 59 | 67 |

6 | 191 | 47 | 53 |

0 | 53 | 13 | 13 |

-3 | -93 | -24 | -27 |

-5 | -161 | -41 | -46 |

-4 | -134 | -34 | -38 |

-1 | -44 | -11 | -12 |

1 | 52 | 13 | 14 |

1 | 94 | 23 | 24 |

4 | 60 | 15 | 19 |

2 | 17 | 4 | 6 |

2 | -24 | -6 | -4 |

2 | -26 | -7 | -5 |

0 | 1 | 0 | 0 |
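
The reconstruction in the table above can be reproduced with the sketch below. Each prediction is the sum of the three preceding samples multiplied by the coefficients 7, -6, and 2 (most recent sample first), shifted right by 2 bits; the residual is then added. Python's `>>` on negative integers rounds toward negative infinity, which matches the values in the table (for example, -190 becomes -48).

```python
coefficients = [7, -6, 2]   # applied to the most recent sample first
shift = 2
samples = [0, 79, 111]      # warm-up samples
residuals = [3, -1, -13, -10, -6, 2, 8, 8, 6, 0, -3,
             -5, -4, -1, 1, 1, 4, 2, 2, 2, 0]

for residual in residuals:
    recent = reversed(samples[-len(coefficients):])
    unshifted = sum(c * s for c, s in zip(coefficients, recent))
    samples.append((unshifted >> shift) + residual)

print(samples)
```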