Encoding Basics for Developers
Web applications rely on several encoding methods to transmit and store data safely and efficiently. This guide walks through the main ones.
1. Base64 Encoding
Concept
An encoding method that converts binary data to ASCII strings.
Characteristics
- Uses 64 safe ASCII characters (A-Z, a-z, 0-9, +, /), plus = for padding
- Output is about 33% larger than the input, because every 3 bytes become 4 characters
- Mainly used for email attachments and for embedding images in web pages
Usage Example
// Encoding (btoa() accepts only Latin-1 characters and throws on anything else)
const text = "Hello World!";
const encoded = btoa(text);
console.log(encoded); // "SGVsbG8gV29ybGQh"
// Decoding
const decoded = atob(encoded);
console.log(decoded); // "Hello World!"
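Because btoa() throws on characters outside Latin-1, text such as Korean or emoji has to be converted to UTF-8 bytes first. The following is a minimal sketch of one way to do that; the helper names toBase64Utf8 and fromBase64Utf8 are hypothetical and not part of any standard API.

```javascript
// Hypothetical helpers (assumed names) for Unicode-safe Base64.
function toBase64Utf8(str) {
  // Encode the string as UTF-8 bytes first.
  const bytes = new TextEncoder().encode(str);
  // Map each byte to a Latin-1 character so btoa() accepts it.
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

function fromBase64Utf8(base64) {
  // atob() yields one Latin-1 character per original byte.
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  // Interpret the bytes as UTF-8 again.
  return new TextDecoder().decode(bytes);
}

const original = "\uC548\uB155\uD558\uC138\uC694"; // "안녕하세요"
const encodedUtf8 = toBase64Utf8(original);
const roundTrip = fromBase64Utf8(encodedUtf8);
console.log(roundTrip === original); // true
```

For plain ASCII input the helper produces the same result as calling btoa() directly.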
2. URL Encoding
Concept
A method that replaces characters that are unsafe in a URL with percent-encoded sequences.
Characteristics
- Converts special characters, spaces, and non-ASCII characters to %XX sequences, one per UTF-8 byte
- Mainly used for URL parameter values
- In JavaScript, use encodeURIComponent() for individual values and encodeURI() for complete URLs
Usage Example
// URL component encoding (recommended)
const param = "Hello World!";
const encoded = encodeURIComponent(param);
console.log(encoded); // "Hello%20World!" (encodeURIComponent leaves ! unescaped)
// Full URL encoding
const url = "https://example.com/search?q=hello world";
const encodedUrl = encodeURI(url);
console.log(encodedUrl); // "https://example.com/search?q=hello%20world"
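The difference between the two functions matters most when a value itself contains URL delimiters. The sketch below uses a made-up value, "rock&roll?x=1", to show that encodeURI() leaves those delimiters alone while encodeURIComponent() escapes them.

```javascript
// Hypothetical value containing characters that act as query-string delimiters.
const rawValue = "rock&roll?x=1";

// encodeURI() keeps delimiters such as & ? = intact, so the value is unchanged.
const viaEncodeURI = encodeURI(rawValue);
console.log(viaEncodeURI); // "rock&roll?x=1"

// encodeURIComponent() escapes them, so the value stays a single parameter.
const viaComponent = encodeURIComponent(rawValue);
console.log(viaComponent); // "rock%26roll%3Fx%3D1"

// Build a URL from the encoded value, then parse it back to check.
const searchUrl = `https://example.com/search?q=${viaComponent}`;
const parsedValue = new URL(searchUrl).searchParams.get("q");
console.log(parsedValue === rawValue); // true
```

Had the value been inserted with encodeURI() instead, the server would have seen two parameters (q and roll?x) rather than one.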
3. HTML Entity Encoding
Concept
Encoding for safely displaying special characters in HTML.
Main Entities
- < → &lt;
- > → &gt;
- & → &amp;
- " → &quot;
- ' → &#39;
Usage Example
<!-- Safely display user input -->
<p>&lt;script&gt;alert('XSS')&lt;/script&gt;</p>
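When the markup is assembled as a string in JavaScript, the same substitution can be applied with a small function. This is a minimal sketch; escapeHtml is a hypothetical helper name, and real projects usually rely on their template engine or framework to do this escaping.

```javascript
// Hypothetical helper: escapes the five characters from the entity list.
function escapeHtml(input) {
  return String(input)
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const userInput = "<script>alert('XSS')</script>";
const safeHtml = `<p>${escapeHtml(userInput)}</p>`;
console.log(safeHtml);
// <p>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</p>
```

Replacing & before the other characters is the one ordering detail that matters; otherwise the ampersands produced by earlier replacements would be escaped a second time.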
4. JSON Encoding
Concept
A method that converts JavaScript objects to JSON strings.
Characteristics
- Widely used for data exchange
- Encoded as UTF-8 when exchanged between systems
- Supports nested objects and arrays
Usage Example
const data = {
  name: "John Doe",
  age: 30,
  hobbies: ["reading", "movies"]
};
const jsonString = JSON.stringify(data);
console.log(jsonString);
// {"name":"John Doe","age":30,"hobbies":["reading","movies"]}
const parsed = JSON.parse(jsonString);
console.log(parsed.name); // "John Doe"
5. Unicode Encoding
Concept
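Unicode is a standard that assigns a unique code point to the characters of the world's writing systems. The UTF encodings define how those code points are stored as bytes.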
Main Encoding Methods
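- UTF-8: Most widely used, variable length (1 to 4 bytes per character), compatible with ASCII
- UTF-16: 2 or 4 bytes per character; used internally by JavaScript strings
- UTF-32: Fixed 4 bytes per character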
Usage Example
// Unicode escape sequence
const unicode = "\uD55C\uAD6D"; // "한국"
console.log(unicode); // "한국"
// UTF-8 byte check
const text = "안녕하세요.";
const bytes = new TextEncoder().encode(text);
console.log(bytes); // Uint8Array(16) [236, 149, 136, 235, 133, 149, 237, 149, 152, 236, 132, 184, 236, 154, 148, 46]
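The variable lengths of UTF-8 and UTF-16 can be observed directly. The sketch below measures four sample characters chosen for illustration (they are not from the original examples): UTF-8 byte count via TextEncoder, UTF-16 code units via the string's length property, and code points via the string iterator.

```javascript
const encoder = new TextEncoder();

// Sample characters, written as escapes so the source file encoding does not matter.
const samples = [
  "A",          // U+0041, ASCII
  "\u00E9",     // U+00E9, "é"
  "\uD55C",     // U+D55C, "한"
  "\u{1F600}"   // U+1F600, an emoji outside the Basic Multilingual Plane
];

const report = samples.map((ch) => ({
  char: ch,
  utf8Bytes: encoder.encode(ch).length, // bytes when stored as UTF-8
  utf16Units: ch.length,                // 16-bit code units in the JS string
  codePoints: [...ch].length            // the string iterator walks code points
}));

for (const row of report) {
  console.log(row.char, row.utf8Bytes, row.utf16Units, row.codePoints);
}
// A 1 1 1
// é 2 1 1
// 한 3 1 1
// 😀 4 2 1
```

The last row is why length can overcount: characters outside the Basic Multilingual Plane occupy two UTF-16 code units (a surrogate pair) in a JavaScript string.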
Encoding Selection Guide
Recommendations by Situation
- Email Attachments: Base64
- URL Parameters: URL Encoding
- API Data Exchange: JSON
- HTML Output: HTML Entities
- Multilingual Support: UTF-8
Security Considerations
1. Encoding ≠ Encryption
- Encoding is just data conversion, not a security feature
- Sensitive data must be encrypted
2. XSS Prevention
- Encode user input for its output context before writing it into HTML
- When inserting user input through the DOM, use textContent instead of innerHTML
3. Prevent Encoding Attacks
- Beware of double encoding attacks, where input such as %253C survives one decoding pass as %3C and becomes < after a second
- Both input validation and output encoding are necessary
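A double encoding attack can be illustrated in a few lines. The sketch below is a simplified scenario, not a real filter: naiveFilterAllows is a hypothetical check that decodes its input once and looks for "<", and a later component is assumed to decode the value a second time.

```javascript
const payload = "<script>";
const once = encodeURIComponent(payload);  // "%3Cscript%3E"
const twice = encodeURIComponent(once);    // "%253Cscript%253E"

// Hypothetical naive filter: decodes once, then rejects anything containing "<".
function naiveFilterAllows(input) {
  return !decodeURIComponent(input).includes("<");
}

console.log(naiveFilterAllows(once));  // false: blocked after one decode
console.log(naiveFilterAllows(twice)); // true: still looks harmless

// A later component that decodes again recovers the original payload.
const afterSecondDecode = decodeURIComponent(decodeURIComponent(twice));
console.log(afterSecondDecode === payload); // true
```

This is why validation should run on the fully decoded value, and why output encoding is still needed at the point where the data is written into HTML.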
Conclusion
Choosing appropriate encoding methods and implementing them correctly is fundamental to web development. Use the right encoding for each situation to develop safe and efficient applications.