Data Storage (II) - Introduction to Computer Science Notes
Best read alongside Prof. Tian-Li Yu's (于天立) open course: course link
Data Storage(II)
Fraction in binary
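- The rules are the same as in decimal: each position to the right of the radix point is worth half of the one before it (1/2, 1/4, 1/8, ...), so .101 = 1/2 + 1/8 = 5/8
Floating-point notation and decoding
- 8-bit representation: 1 sign bit (s), 3 exponent bits (eee), 4 mantissa bits (mmmm)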
- Ex (Use the table of binary): 10010101 => 1(001)(0101) => $-(2^{1})(2^{-2} + 2^{-4}) = -\frac{5}{8}$
- On widely used 64-bit computers, the exponent takes 11 bits and the mantissa takes 52 bits (the remaining bit is the sign)
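The decoding steps for the 8-bit format can be sketched in a few lines of Python. This is only an illustration: the function name `decode_float8` is made up here, and it assumes the exponent field is read as a plain unsigned binary number, which is how the "table of binary" examples in these notes appear to work.

```python
def decode_float8(bits: str) -> float:
    """Decode an 8-bit pattern laid out as s(eee)(mmmm)."""
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:4], 2)      # assumption: plain binary exponent
    mantissa = int(bits[4:], 2) / 16  # .mmmm read as a binary fraction
    return sign * mantissa * 2 ** exponent


print(decode_float8("10010101"))  # -0.625, i.e. -5/8
print(decode_float8("00101010"))  # 2.5
```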
Truncation error
- The value needs more precision than the mantissa can hold, so the extra bits are dropped.
- Ex (Use the table of binary): $2\frac{5}{8}$ -> 10.101 (base two, fixed point) => .10101 => 0(010)(1010) => $2\frac{1}{2}$ (the last bit does not fit into the 4-bit mantissa)
Normalized form
- The first bit of the mantissa is 1
- Zero is represented by all 0s
- Normalization:
- Ex (Use the table of binary): 00100011 => 0(010)(0011) => $2^{2} \times .0011$ => $2^{0} \times .1100$ => 0(000)(1100)
- IEEE normalized form
- The left-most bit of the mantissa is always 1, so it is implied instead of stored (Ex: .0101 -> 1.0101)
- Standard normalized form is (s)(eee)(mmmm) => $(-1)^{s} \times 2^{eee} \times 1.mmmm$
- Ex (Use the table of binary): 01100011 => (0)(110)(0011) => $2^{6} \times 1.0011$
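The normalization example earlier in this section (00100011 => 00001100) can be reproduced with a small sketch. The helper name `normalize` is hypothetical, and the exponent is again assumed to be a plain binary number that cannot go below 0.

```python
def normalize(bits: str) -> str:
    """Shift the mantissa left until its first bit is 1, lowering the exponent to match."""
    sign, exponent, mantissa = bits[0], int(bits[1:4], 2), bits[4:]
    if "1" not in mantissa:
        return "0" * 8  # zero is represented by all 0s
    while mantissa[0] == "0" and exponent > 0:
        mantissa = mantissa[1:] + "0"  # shift left by one bit
        exponent -= 1                  # compensate so the value is unchanged
    return f"{sign}{exponent:03b}{mantissa}"


print(normalize("00100011"))  # 00001100
```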
Loss of digits
Ex (Use the table of excess):
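- $4 + \frac{1}{4} + \frac{1}{4}$ = (01111000 + 00111000) + 00111000 -> make the exponents the same: (01111000 + 01110000) + 01110000 = 4 (the result is wrong, because each $\frac{1}{4}$ turns into 0 once its mantissa is shifted to match the exponent of 4)
- $4 + (\frac{1}{4} + \frac{1}{4})$ = 01111000 + (00111000 + 00111000) = $4\frac{1}{2}$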
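The effect of the order of addition can be simulated. The sketch below assumes the excess-4 exponent used in this example, handles only non-negative values, and ignores exponent overflow; `decode_excess4` and `add_float8` are names invented for this illustration.

```python
def decode_excess4(bits: str) -> float:
    """Decode s(eee)(mmmm) with an excess-4 exponent."""
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:4], 2) - 4
    return sign * int(bits[4:], 2) / 16 * 2 ** exponent


def add_float8(a: str, b: str) -> str:
    """Add two non-negative patterns, truncating to a 4-bit mantissa."""
    ea, ma = int(a[1:4], 2), int(a[4:], 2)
    eb, mb = int(b[1:4], 2), int(b[4:], 2)
    e = max(ea, eb)
    ma >>= e - ea  # align exponents; bits shifted out are lost
    mb >>= e - eb
    m = ma + mb
    if m > 0b1111:  # mantissa overflow: shift right and raise the exponent
        m >>= 1
        e += 1
    return f"0{e:03b}{m:04b}"


four, quarter = "01111000", "00111000"
print(decode_excess4(add_float8(add_float8(four, quarter), quarter)))  # 4.0
print(decode_excess4(add_float8(four, add_float8(quarter, quarter))))  # 4.5
```

Adding the small values first keeps their sum large enough to survive the alignment with 4.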
Data compression
- Two types => Lossy and Lossless
- Lossless
- Run-length encoding (RLE)
- After compression, wwwwwww is represented as 7w
- Frequency-dependent encoding => Huffman encoding
- Dictionary encoding => Adaptive dictionary encoding, LZW encoding
- You define the dictionary values yourself
- Lossy
- Relative / difference encoding
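The run-length encoding example (wwwwwww => 7w) can be sketched as follows. The function name `rle_encode` is made up for this note, and the sketch assumes the input contains no digits, since otherwise the counts would be ambiguous.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with <count><character>."""
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(text))


print(rle_encode("wwwwwww"))  # 7w
```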
- Huffman encoding
- Ex: AAABBBAABCAAAABD
- Traditional encoding => Code book: A => 00; B => 01; C => 10; D => 11. With 2 bits per symbol the message becomes 00000001010100000110000000000111 (32 bits)
- Huffman encoding =>
- Count the occurrences: A(9); B(5); C(1); D(1)
- Build a Huffman tree: repeatedly merge the two least frequent nodes (C and D first, then B, then A), so the more frequent a symbol is, the shorter its code
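A minimal sketch of the tree-building step, using Python's `heapq` as the priority queue. The function name `huffman_codes` is hypothetical, and which branch gets 0 or 1 is an arbitrary choice made here; only the code lengths are determined by the counts.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix code by repeatedly merging the two lightest nodes."""
    counts = Counter(text)
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


message = "AAABBBAABCAAAABD"
codes = huffman_codes(message)
print({s: len(c) for s, c in sorted(codes.items())})  # {'A': 1, 'B': 2, 'C': 3, 'D': 3}
print(sum(len(codes[ch]) for ch in message))          # 25 bits instead of 32
```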
- LZW encoding
- A kind of dictionary encoding that does not need to store the dictionary along with the message
- The idea is to add new entries to the dictionary while encoding (usually the ASCII codes already represent the digits and letters, so no additional dictionary is needed)
- Code book: x => 1, y => 2, space => 3
- Ex: xyx xyx xyx xyx
=> 1
=> 12
=> 121
=> 1213
When the space is reached, the word xyx (121) is appended to the dictionary as 4
=> 12134 => 121343434
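The worked example can be reproduced with a simplified, word-level sketch. Note that this is not a full LZW implementation (real LZW adds substrings to the dictionary, not whole words); it only mirrors the steps of the example, and `encode_words` is a name invented here.

```python
def encode_words(message: str, code_book: dict[str, int]) -> list[int]:
    """Encode a message, adding each new word to the dictionary as it is seen."""
    dictionary = dict(code_book)  # starts with single characters only
    output = []
    for i, word in enumerate(message.split(" ")):
        if i > 0:
            output.append(dictionary[" "])
        if not word:
            continue  # consecutive spaces produce empty words
        if word in dictionary:
            output.append(dictionary[word])
        else:
            output.extend(dictionary[ch] for ch in word)
            dictionary[word] = len(dictionary) + 1  # next free code
    return output


print(encode_words("xyx xyx xyx xyx", {"x": 1, "y": 2, " ": 3}))
# [1, 2, 1, 3, 4, 3, 4, 3, 4]
```

A decoder can rebuild the same dictionary while reading the codes, which is why the dictionary itself does not have to be transmitted.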
Images, audio, and video
- Images:
- GIF: 256 colors, dictionary encoding
- JPEG: Lossy / lossless encoding
- Audio:
- MP3: Lossy encoding
- Video:
- MPEG: Lossy encoding
Communication errors
- The purpose of compressing data is to remove redundancy
- To handle communication errors => we add redundancy
- Error detection => it does not correct errors; it can only check whether an error occurred
- Applications (detection):
- ID numbers
- ISBN
- Parity code
- Error correction => can correct errors to some degree
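Applications of error detection / correction
- Taiwan ID:
- Convert the first English letter into a two-digit number (xy): $n_0 = x + 9y$
- Weight the eight digits that follow, counting $d_i$ from the right: $n_1 = \sum_{i=1}^{8} i \cdot d_i = d_1 + 2d_2 + \dots + 8d_8$
- Check code => $c = 10 - ((n_0 + n_1) \bmod 10)$ (a result of 10 means the check code is 0)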
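A sketch of the check-code computation. The letter-to-number table is not given in these notes; the `LETTERS` string below is the commonly cited mapping (A => 10, B => 11, ...) and should be treated as an assumption, as should the function name.

```python
# Assumed letter table: position in this string + 10 is the two-digit code.
LETTERS = "ABCDEFGHJKLMNPQRSTUVXYWZIO"


def taiwan_id_check_digit(prefix: str) -> int:
    """prefix = the letter plus the first eight digits, e.g. 'A12345678'."""
    x, y = divmod(LETTERS.index(prefix[0]) + 10, 10)
    n0 = x + 9 * y
    # Weights 8, 7, ..., 1 from left to right over the eight digits
    n1 = sum(w * int(d) for w, d in zip(range(8, 0, -1), prefix[1:9]))
    return (10 - (n0 + n1) % 10) % 10


print(taiwan_id_check_digit("A12345678"))  # 9
```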
- ISBN-10
- ISBN template: 0-273-75139
- Compute $S = 0 \times 10 + 2 \times 9 + 7 \times 8 + 3 \times 7 + 7 \times 6 + 5 \times 5 + 1 \times 4 + 3 \times 3 + 9 \times 2 = 193$
- M = S mod 11 = 6
- N = 11 - M = 5
- If N = 10 the check code is X
- If N = 11 the check code is 0
- Otherwise, the check code is the number N
- The ISBN code is 0-273-75139-5
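The ISBN-10 steps translate directly into code. The function name `isbn10_check_char` is made up for this sketch; hyphens in the input are simply skipped.

```python
def isbn10_check_char(first_nine: str) -> str:
    """Compute the ISBN-10 check character from the first nine digits."""
    digits = [int(c) for c in first_nine if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    s = sum(w * d for w, d in zip(range(10, 1, -1), digits))  # weights 10..2
    n = 11 - s % 11
    if n == 10:
        return "X"
    if n == 11:
        return "0"
    return str(n)


print(isbn10_check_char("0-273-75139"))  # 5
```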
- Parity bits
- Add one bit so that the number of 1s is odd (odd parity)
- The technique is used in communication and RAID
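Odd parity can be sketched as follows. Placing the parity bit at the front is a choice made for this illustration, and both function names are hypothetical.

```python
def add_odd_parity(bits: str) -> str:
    """Prepend a parity bit so that the total number of 1s is odd."""
    parity = "0" if bits.count("1") % 2 == 1 else "1"
    return parity + bits


def check_odd_parity(word: str) -> bool:
    """Return True if the word still has an odd number of 1s."""
    return word.count("1") % 2 == 1


word = add_odd_parity("1000001")
print(word)                          # 11000001
print(check_odd_parity(word))        # True
print(check_odd_parity("11000011"))  # False: one bit was flipped
```

A single flipped bit is detected, but two flipped bits cancel out and go unnoticed.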
- Error-correcting code (ECC)
- (3, 1) repetition code (can correct 1-bit errors) -> separate the data into groups of 1 bit each and add two redundant copies of the bit to every group; the receiver takes a majority vote within each group
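A minimal sketch of the (3, 1) repetition code with majority-vote decoding; the function names are chosen here for illustration.

```python
def rep3_encode(bits: str) -> str:
    """Send every bit three times."""
    return "".join(b * 3 for b in bits)


def rep3_decode(coded: str) -> str:
    """Take a majority vote within each group of three bits."""
    groups = (coded[i:i + 3] for i in range(0, len(coded), 3))
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)


print(rep3_encode("101"))        # 111000111
print(rep3_decode("110000111"))  # 101, the flipped third bit is corrected
```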
- Hamming distance
- Comparing two bit patterns, the Hamming distance is the number of positions in which they differ.
- Error correction with Hamming distance
- Maximizing the Hamming distance among the symbols
- Sample code book: (chart not reproduced here)
According to the chart, we correct 010100 to 011100 (D), the code word with the smallest Hamming distance to it
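Nearest-code-word decoding can be sketched as below. The chart is missing from these notes, so the `CODE_BOOK` used here is an assumption (a common textbook example); only D => 011100 is confirmed by the text above.

```python
# Assumed sample code book; only the entry for D appears in the notes.
CODE_BOOK = {
    "A": "000000", "B": "001111", "C": "010011", "D": "011100",
    "E": "100110", "F": "101001", "G": "110101", "H": "111010",
}


def hamming_distance(a: str, b: str) -> int:
    """Count the positions in which two equal-length bit patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_symbol(received: str) -> str:
    """Pick the symbol whose code word is closest to the received pattern."""
    return min(CODE_BOOK, key=lambda s: hamming_distance(CODE_BOOK[s], received))


print(hamming_distance("010100", "011100"))  # 1
print(decode_symbol("010100"))               # D
```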
- Generating a (7, 4) Hamming code
- Title: Data Storage (II) - Introduction to Computer Science Notes
- Author: Shih Jiun Lin
- Created at : 2023-09-22 16:29:15
- Updated at : 2023-01-24 01:29:36
- Link: https://shih-jiun-lin.github.io/2023/09/22/2023-01-17-Data Storage(II)/
- License: This work is licensed under CC BY-NC-SA 4.0.