Course Code – CSA0449
Course Name – Operating Systems for Gaming Technologies
Faculty Name: Dr R A KARTHIKA
HIGHER ORDER ASSIGNMENT
BY
CHANIKYA.M(192311193)
Designing a Custom Linker for a New Programming Language
Overview
A custom linker is essential for managing both static and dynamic linking of libraries in a new
programming language. This document outlines the design, implementation details, optimization
strategies, and testing framework for such a linker.
The linker’s primary responsibilities include:
• Resolving symbols across object files and libraries.
• Adjusting memory addresses for program execution.
• Handling dependencies between static and dynamic libraries.
• Ensuring security and performance in the linking process.
1. Symbol Resolution and Relocation
1.1 Symbol Resolution
Symbol resolution is the process of matching symbol references in object files to their corresponding
definitions. The linker performs the following steps:
1. Symbol Table Construction:
o Parses all input object files and libraries to build a global symbol table.
o Entries include symbol names, their types (e.g., function, variable), and locations
(object file, library, or memory address).
2. Definition Matching:
o Undefined symbols in the symbol table are matched to their definitions.
o If a symbol is multiply defined, the linker raises an error or follows user-defined
resolution rules (e.g., prioritizing the first definition).
o If a symbol remains unresolved, the linker flags it as an error.
3. Conflict Resolution:
o Implements rules such as precedence based on file order, scope modifiers, or explicit
overrides specified by developers.
1.2 Relocation
Relocation involves adjusting symbol addresses for execution:
1. Base Address Assignment:
o Determines base addresses for each segment (text, data, BSS) of the input files.
o Supports fixed or relocatable base address strategies.
2. Relocation Table Processing:
o Reads relocation entries from object files.
o Adjusts addresses for:
▪ Absolute Relocations: Writes the symbol's final absolute address (plus any addend) into the reference.
▪ Relative Relocations: Writes the signed offset from the reference site to the target symbol.
3. Output File Generation:
o Produces an executable or shared object file with updated addresses and relocation
records for runtime use.
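The relocation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the layout of any real object format: the `Reloc` struct, the `RELOC_ABS32` and `RELOC_PC32` kinds, and `apply_relocs` are hypothetical names, and the fields are 32 bits wide and written in host byte order. The relative case follows the common "symbol + addend - patch site" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation kinds for this sketch. */
typedef enum { RELOC_ABS32, RELOC_PC32 } RelocKind;

/* One relocation entry: where to patch and which address to patch in. */
typedef struct {
    RelocKind kind;
    size_t    offset;    /* byte offset of the 32-bit field in the section */
    uint32_t  sym_addr;  /* final address of the referenced symbol */
    int32_t   addend;    /* constant added to the symbol address */
} Reloc;

/* Patch every relocation site in `section`, which will be loaded at `base`.
 * Returns 0 on success, -1 if an entry points outside the section. */
int apply_relocs(uint8_t *section, size_t size, uint32_t base,
                 const Reloc *relocs, size_t count) {
    assert(section != NULL && relocs != NULL);
    for (size_t i = 0; i < count; i++) {
        const Reloc *r = &relocs[i];
        if (r->offset > size || size - r->offset < sizeof(uint32_t))
            return -1;
        uint32_t value = r->sym_addr + (uint32_t)r->addend;
        if (r->kind == RELOC_PC32)
            value -= base + (uint32_t)r->offset;  /* distance from patch site */
        memcpy(section + r->offset, &value, sizeof value);
    }
    return 0;
}
```

With a section based at 0x1000, an absolute entry for a symbol at 0x2000 with addend 4 stores 0x2004, while a relative entry at offset 4 for a symbol at 0x1010 stores 0xC.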
2. Library Handling
2.1 Static Libraries (.a)
1. Indexing:
o Parses the archive index of the static library.
o Identifies which object files are needed based on unresolved symbols.
2. Selective Extraction:
o Extracts only the required object files, reducing memory and processing overhead.
3. Symbol Table Augmentation:
o Adds symbols from the extracted object files to the global symbol table.
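The selective extraction described above can be sketched as a small worklist loop. This is an illustration only: the `Member` struct stands in for a parsed archive index, and `select_members`, `MAX_SYMS`, and `MAX_WORK` are hypothetical names and limits chosen for the example. Extracting a member may introduce new undefined symbols, so the list of wanted symbols grows while it is being processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 8    /* per-member list capacity (assumed for the sketch) */
#define MAX_WORK 64   /* worklist capacity (assumed for the sketch) */

/* Hypothetical in-memory view of one archive member (object file). */
typedef struct {
    const char *name;
    const char *defines[MAX_SYMS];  /* NULL-terminated list */
    const char *needs[MAX_SYMS];    /* NULL-terminated list */
    bool        extracted;
} Member;

/* True if `sym` appears in the first `limit` entries (stops at NULL). */
static bool contains(const char *const *list, size_t limit, const char *sym) {
    for (size_t i = 0; i < limit && list[i] != NULL; i++)
        if (strcmp(list[i], sym) == 0) return true;
    return false;
}

/* Mark the members needed to satisfy `undefined` (NULL-terminated),
 * following the references of each extracted member in turn.
 * Returns the number of members extracted. */
size_t select_members(Member *members, size_t n, const char *const *undefined) {
    assert(members != NULL && undefined != NULL);
    const char *work[MAX_WORK];
    size_t count = 0, extracted = 0;

    for (size_t i = 0; undefined[i] != NULL && count < MAX_WORK; i++)
        work[count++] = undefined[i];

    for (size_t w = 0; w < count; w++) {
        for (size_t m = 0; m < n; m++) {
            if (!contains(members[m].defines, MAX_SYMS, work[w])) continue;
            if (!members[m].extracted) {
                members[m].extracted = true;
                extracted++;
                /* Queue the new member's own references. */
                for (size_t k = 0; k < MAX_SYMS && members[m].needs[k]; k++)
                    if (count < MAX_WORK &&
                        !contains(work, count, members[m].needs[k]))
                        work[count++] = members[m].needs[k];
            }
            break;  /* the first member that defines the symbol wins */
        }
    }
    return extracted;
}
```

Members that define nothing on the worklist are never marked, which is what keeps unused object files out of the link.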
2.2 Dynamic Libraries (.so)
1. Runtime Linking:
o Emits placeholder entries for symbols that are defined in dynamic libraries.
o Leaves these references to be bound at load time or run time by the operating system’s
dynamic loader.
2. Dependency Resolution:
o Analyzes dependency metadata (e.g., DT_NEEDED entries in ELF files) to recursively
resolve dependencies.
3. Version Compatibility:
o Checks version metadata (e.g., SONAME and version tags) to ensure compatibility.
o Allows multiple versions of the same library to coexist if supported by the runtime.
4. Lazy Binding:
o Uses PLT (Procedure Linkage Table) and GOT (Global Offset Table) mechanisms to
resolve symbols only when first accessed.
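The recursive dependency resolution in item 2 can be sketched as a depth-first walk over each library's list of needed names. This is a simplified model: the `Lib` and `LoadPlan` structs and `plan_load_order` are hypothetical, the "needed" arrays only imitate DT_NEEDED entries, and a real dynamic loader uses its own search order. The `seen` list ensures each library is handled once and guards against dependency cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 4    /* assumed limits for the sketch */
#define MAX_LIBS 16

/* Hypothetical record of one shared library and the names it needs. */
typedef struct {
    const char *soname;
    const char *needed[MAX_DEPS];  /* NULL-terminated, like DT_NEEDED */
} Lib;

typedef struct {
    const char *seen[MAX_LIBS];    /* libraries already entered */
    size_t      nseen;
    const char *order[MAX_LIBS];   /* dependencies before dependents */
    size_t      norder;
} LoadPlan;

static int visit(const Lib *libs, size_t n, const char *soname, LoadPlan *plan) {
    for (size_t i = 0; i < plan->nseen; i++)
        if (strcmp(plan->seen[i], soname) == 0)
            return 0;  /* already handled, or part of a cycle */

    const Lib *lib = NULL;
    for (size_t i = 0; i < n; i++)
        if (strcmp(libs[i].soname, soname) == 0) { lib = &libs[i]; break; }
    if (lib == NULL || plan->nseen == MAX_LIBS)
        return -1;     /* missing library or plan full */

    plan->seen[plan->nseen++] = soname;
    for (size_t d = 0; d < MAX_DEPS && lib->needed[d] != NULL; d++)
        if (visit(libs, n, lib->needed[d], plan) != 0)
            return -1;
    plan->order[plan->norder++] = soname;  /* appended after its dependencies */
    return 0;
}

/* Fill `plan` with a load order for `root`. Returns 0 on success, -1 on error. */
int plan_load_order(const Lib *libs, size_t n, const char *root, LoadPlan *plan) {
    assert(libs != NULL && root != NULL && plan != NULL);
    plan->nseen = 0;
    plan->norder = 0;
    return visit(libs, n, root, plan);
}
```

For a program that needs a graphics library and the C library, where the graphics library also needs the C library, the plan lists the C library first, then the graphics library, then the program.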
3. Optimization Strategies
3.1 Linking Speed
1. Incremental Linking:
o Processes only modified object files, retaining previously linked results.
o Useful during iterative development cycles.
2. Parallel Processing:
o Resolves symbols and processes relocation entries in parallel using multi-threading.
3. Optimized Parsing:
o Utilizes efficient data structures like hash maps for faster symbol lookups.
o Caches parsed symbol tables or uses intermediate (partial) linking stages so that unchanged inputs are not re-parsed.
4. Symbol Elimination:
o Identifies and discards unreferenced symbols early in the link, which reduces the work
done by later passes and shrinks the output.
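The hash-map lookup mentioned under Optimized Parsing might look like the sketch below. It is one possible design, not a prescribed one: a fixed-size open-addressing table with linear probing and the FNV-1a string hash. The names `SymbolMap`, `map_put`, and `map_get` and the capacity `TABLE_SIZE` are assumptions for illustration. The table stores the caller's name pointers, so the strings must outlive the map, and the map must start zero-initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256  /* power of two, so the hash can be masked */

typedef struct {
    const char *name;   /* NULL marks an empty slot */
    uint32_t    address;
} Slot;

typedef struct {
    Slot   slots[TABLE_SIZE];
    size_t used;
} SymbolMap;

/* FNV-1a, a simple non-cryptographic string hash. */
static uint32_t hash_name(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h;
}

/* Insert a symbol or update its address. Returns 0, or -1 if the table is full. */
int map_put(SymbolMap *map, const char *name, uint32_t address) {
    assert(map != NULL && name != NULL);
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);
    for (;; i = (i + 1) & (TABLE_SIZE - 1)) {
        Slot *s = &map->slots[i];
        if (s->name == NULL) {
            if (map->used == TABLE_SIZE - 1)
                return -1;  /* keep one slot empty so probing always stops */
            s->name = name;
            s->address = address;
            map->used++;
            return 0;
        }
        if (strcmp(s->name, name) == 0) {
            s->address = address;
            return 0;
        }
    }
}

/* Look up a symbol. Returns its slot, or NULL if it is not present. */
const Slot *map_get(const SymbolMap *map, const char *name) {
    assert(map != NULL && name != NULL);
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);
    for (; map->slots[i].name != NULL; i = (i + 1) & (TABLE_SIZE - 1))
        if (strcmp(map->slots[i].name, name) == 0)
            return &map->slots[i];
    return NULL;
}
```

Compared with a linear scan over every symbol, a lookup here usually touches only a few slots, which matters once a link involves thousands of symbols.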
3.2 Memory Usage
1. Intermediate File Caching:
o Stores intermediate results in an on-disk cache so they need not be recomputed or held in memory.
2. Segment Consolidation:
o Combines small code or data segments to reduce fragmentation and improve
memory locality.
3. Dynamic Memory Allocation:
o Allocates memory for linking structures dynamically to minimize peak usage.
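Segment consolidation comes down to packing input sections back to back while honoring each section's alignment. The sketch below shows that arithmetic only. The `Section` struct and `layout_sections` are hypothetical names, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one input section to be merged. */
typedef struct {
    uint32_t size;    /* bytes */
    uint32_t align;   /* required alignment, a power of two */
    uint32_t offset;  /* filled in: position inside the merged segment */
} Section;

/* Round `value` up to the next multiple of `align`. */
static uint32_t align_up(uint32_t value, uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Assign consecutive offsets to all sections and return the merged size. */
uint32_t layout_sections(Section *sections, size_t count) {
    uint32_t cursor = 0;
    for (size_t i = 0; i < count; i++) {
        cursor = align_up(cursor, sections[i].align);
        sections[i].offset = cursor;
        cursor += sections[i].size;
    }
    return cursor;
}
```

For sections of 10, 3, and 8 bytes with alignments 4, 1, and 8, the offsets come out as 0, 10, and 16, for a merged size of 24 bytes with only 3 bytes of padding.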
4. Security Model
4.1 Symbol Hijacking Prevention
1. Restricted Symbol Visibility:
o Limits symbol export to only explicitly declared symbols using directives like
--exported-symbols.
2. Namespace Isolation:
o Implements namespace mechanisms to prevent symbol collisions and hijacking
across libraries.
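Restricted visibility can be sketched as a filter applied to the output symbol table: anything not named in the export list is demoted to local. The `OutSymbol` struct, the `Visibility` enum, and `apply_export_list` are hypothetical names for illustration. How the export list reaches the linker (option, directive, or file) is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { VIS_LOCAL, VIS_EXPORTED } Visibility;

/* Hypothetical entry in the output file's symbol table. */
typedef struct {
    const char *name;
    Visibility  vis;
} OutSymbol;

/* Mark a symbol exported only if its name appears in `exports`;
 * every other symbol becomes local. Returns the number exported. */
size_t apply_export_list(OutSymbol *syms, size_t nsyms,
                         const char *const *exports, size_t nexports) {
    assert(syms != NULL || nsyms == 0);
    size_t exported = 0;
    for (size_t i = 0; i < nsyms; i++) {
        syms[i].vis = VIS_LOCAL;  /* default: hidden from other modules */
        for (size_t e = 0; e < nexports; e++) {
            if (strcmp(syms[i].name, exports[e]) == 0) {
                syms[i].vis = VIS_EXPORTED;
                exported++;
                break;
            }
        }
    }
    return exported;
}
```

Because hiding is the default, a helper function that was never meant to be public cannot be overridden or called from another library by name.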
4.2 Code Injection Prevention
1. Library Signature Verification:
o Verifies digital signatures of libraries before loading them to prevent tampering.
2. Read-Only Code Segments:
o Marks code sections as read-only (executable but not writable) in memory to prevent runtime modification.
3. Runtime Integrity Checks:
o Validates integrity of loaded libraries periodically using checksums or cryptographic
hashes.
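The integrity check in item 3 follows a record-then-compare pattern, sketched below. Note the assumption: FNV-1a is used only as a small stand-in and is NOT cryptographically secure, so a real implementation would substitute a cryptographic hash and a verified signature. The names `image_checksum` and `image_is_intact` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (64-bit): a fast NON-cryptographic checksum, used here only as a
 * stand-in for the cryptographic hash a real loader would use. */
uint64_t image_checksum(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    uint64_t h = 14695981039346656037ull;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ull;
    }
    return h;
}

/* Compare the current contents of a loaded library image against the
 * checksum recorded when it was first verified. Returns 1 if unchanged. */
int image_is_intact(const uint8_t *data, size_t len, uint64_t recorded) {
    return image_checksum(data, len) == recorded;
}
```

The checksum would be recorded once after signature verification, then recomputed at each periodic check. Any mismatch means the image changed after it was loaded.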
4.3 Dynamic Linking Security
1. Address Space Layout Randomization (ASLR):
o Randomizes library load addresses to prevent predictable memory exploits.
2. Binding Restrictions:
o Limits runtime symbol binding to only required libraries.
3. Isolation of Privileged Libraries:
o Loads critical libraries into isolated address spaces to prevent unauthorized access.
5. Implementation Details
5.1 Linking Process Diagram
(Include a diagram detailing the process: input parsing, symbol resolution, relocation, and output
generation.)
5.2 Flowchart for Library Handling
(Include a detailed flowchart for static and dynamic library linking.)
5.3 Example Code Snippets
1. Parsing Object Files:
o Code to read ELF or COFF object files and extract symbol and relocation tables.
2. Relocation Processing:
o Sample function to process relocation entries and adjust addresses.
3. Version Compatibility Check:
o Example implementation to validate library versions during dynamic linking.
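As a starting point for item 3, the sketch below checks version compatibility from SONAME strings alone. It assumes the common naming convention "lib<name>.so.<major>", in which a change of major number signals an incompatible interface. The function names `parse_soname` and `soname_compatible` are hypothetical, and finer-grained symbol version tags are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Split a name such as "libgfx.so.2" into the length of its base
 * ("libgfx.so") and its major version (2). Returns 0 on success,
 * -1 if the name has no numeric major version. */
static int parse_soname(const char *soname, size_t *base_len, long *major) {
    const char *marker = strstr(soname, ".so.");
    if (marker == NULL)
        return -1;
    const char *digits = marker + 4;
    if (*digits < '0' || *digits > '9')
        return -1;
    *major = strtol(digits, NULL, 10);
    *base_len = (size_t)(marker - soname) + 3;  /* include ".so" */
    return 0;
}

/* Returns 1 if `provided` can satisfy a dependency on `required`:
 * same base name and same major version. Returns 0 otherwise. */
int soname_compatible(const char *required, const char *provided) {
    assert(required != NULL && provided != NULL);
    size_t rlen, plen;
    long rmajor, pmajor;
    if (parse_soname(required, &rlen, &rmajor) != 0 ||
        parse_soname(provided, &plen, &pmajor) != 0)
        return 0;
    return rlen == plen &&
           strncmp(required, provided, rlen) == 0 &&
           rmajor == pmajor;
}
```

A dependency on libgfx.so.1 is satisfied by libgfx.so.1 but rejected for libgfx.so.2 or libaudio.so.1, which matches the Version Mismatch case in the test plan.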
6. Test Cases
6.1 Functional Validation
1. Conflicting Symbols:
o Input: Object files with the same symbol defined.
o Expected: Error message indicating a conflict.
2. Undefined Symbols:
o Input: Object files with unresolved references.
o Expected: Error message for undefined symbols.
3. Version Mismatch:
o Input: Dynamic libraries with incompatible versions.
o Expected: Error message indicating version incompatibility.
6.2 Performance Metrics
1. Linking Time:
o Measure time for linking projects of different sizes (e.g., 10, 100, 1000 object files).
2. Memory Usage:
o Monitor peak memory usage during the linking process.
3. Incremental vs. Full Linking:
o Compare performance between incremental and full linking scenarios.
6.3 Security Testing
1. Symbol Hijacking Attempt:
o Input: Libraries with overlapping symbols.
o Expected: Proper resolution or rejection based on namespace rules.
2. Tampered Library:
o Input: Modified library without valid signature.
o Expected: Rejection of the library.
3. ASLR Validation:
o Input: The same library loaded in several separate program runs.
o Expected: A different base address in each run.
7. Prototype C Program
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_SYMBOLS 100
#define MAX_FILES 10
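// Symbol table entry
typedef struct {
    char name[50];   // symbol name (at most 49 characters)
    int address;     // address assigned by the defining object file
    int defined;     // 1 if a definition has been seen, 0 if only referenced
} Symbol;
// Global symbol table
Symbol symbolTable[MAX_SYMBOLS];
int symbolCount = 0;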
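// Add a symbol to the global symbol table, or update an existing entry.
// defined is 1 for a definition and 0 for a reference.
void addSymbol(const char *name, int address, int defined) {
    for (int i = 0; i < symbolCount; i++) {
        if (strcmp(symbolTable[i].name, name) == 0) {
            if (defined && symbolTable[i].defined) {
                fprintf(stderr, "Error: Multiple definitions of symbol %s\n", name);
                exit(1);
            }
            if (defined) {
                // A definition resolves an earlier reference
                symbolTable[i].address = address;
                symbolTable[i].defined = 1;
            }
            return;
        }
    }
    // Reject input that would overflow the fixed-size table or name buffer
    if (symbolCount >= MAX_SYMBOLS) {
        fprintf(stderr, "Error: Symbol table full (max %d symbols)\n", MAX_SYMBOLS);
        exit(1);
    }
    if (strlen(name) >= sizeof(symbolTable[symbolCount].name)) {
        fprintf(stderr, "Error: Symbol name too long: %s\n", name);
        exit(1);
    }
    // Add new symbol
    strcpy(symbolTable[symbolCount].name, name);
    symbolTable[symbolCount].address = address;
    symbolTable[symbolCount].defined = defined;
    symbolCount++;
}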
// Resolve a symbol name to its address (exits if it is undefined or unknown)
int resolveSymbol(const char *name) {
    for (int i = 0; i < symbolCount; i++) {
        if (strcmp(symbolTable[i].name, name) == 0) {
            if (!symbolTable[i].defined) {
                fprintf(stderr, "Error: Undefined symbol %s\n", name);
                exit(1);
            }
            return symbolTable[i].address;
        }
    }
    fprintf(stderr, "Error: Symbol %s not found\n", name);
    exit(1);
}
// Parse a simple object file (text-based for simplicity).
// Each line has the form "<type> <name> <address>", where type is
// 'D' (definition) or 'R' (reference), e.g. "D main 100" or "R printf 0".
void parseObjectFile(const char *filename) {
    FILE *file = fopen(filename, "r");
    if (!file) {
        perror("Error opening file");
        exit(1);
    }
    char line[100];
    while (fgets(line, sizeof(line), file)) {
        char type;
        char name[50];
        int address;
        // " %c" skips leading whitespace; "%49s" keeps name within its buffer
        if (sscanf(line, " %c %49s %d", &type, name, &address) == 3) {
            if (type == 'D') {
                // Defined symbol
                addSymbol(name, address, 1);
            } else if (type == 'R') {
                // Referenced symbol
                addSymbol(name, 0, 0);
            }
        }
    }
    fclose(file);
}
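// Link files: verify that every symbol is defined, then write the symbol map
void linkFiles(const char *outputFile) {
    // Report all undefined symbols first so a failed link leaves no partial output
    int undefinedCount = 0;
    for (int i = 0; i < symbolCount; i++) {
        if (!symbolTable[i].defined) {
            fprintf(stderr, "Error: Undefined symbol %s\n", symbolTable[i].name);
            undefinedCount++;
        }
    }
    if (undefinedCount > 0) {
        exit(1);
    }
    FILE *out = fopen(outputFile, "w");
    if (!out) {
        perror("Error creating output file");
        exit(1);
    }
    fprintf(out, "Linked Symbols:\n");
    for (int i = 0; i < symbolCount; i++) {
        fprintf(out, "%s: %d\n", symbolTable[i].name, symbolTable[i].address);
    }
    fclose(out);
}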
int main(int argc, char *argv[]) {
    if (argc < 3) {
        fprintf(stderr, "Usage: %s <output_file> <input_files...>\n", argv[0]);
        return 1;
    }
    const char *outputFile = argv[1];
    // Parse each input object file
    for (int i = 2; i < argc; i++) {
        parseObjectFile(argv[i]);
    }
    // Link files
    linkFiles(outputFile);
    printf("Linking completed. Output written to %s\n", outputFile);
    return 0;
}
Conclusion
The proposed linker design offers a framework for handling static and dynamic linking in the
new programming language. It addresses performance through incremental and parallel linking,
security through restricted symbol visibility and signature verification, and compatibility through
library version checks. The test plan in Section 6 provides a way to validate each of these goals.