
Building a Custom Linux Package Manager: A Step-by-Step Guide

Linux package managers are the backbone of system administration, simplifying the installation, update, and removal of software. Tools like APT (Debian/Ubuntu), Pacman (Arch), and DNF (Fedora) are household names, but have you ever wondered how they work under the hood? Or perhaps you’ve needed a lightweight, specialized package manager for an embedded system, a custom Linux distribution, or a niche use case? In this guide, we’ll demystify the process of building a basic but functional Linux package manager from scratch. We’ll cover core concepts like package formatting, metadata management, dependency resolution, and installation logic. By the end, you’ll have a working tool to create, install, and manage custom packages, plus the knowledge to expand it further.

Table of Contents

  1. Prerequisites
  2. Core Components of a Package Manager
  3. Step 1: Define the Package Format
  4. Step 2: Build a Package Creator Tool
  5. Step 3: Metadata Storage with SQLite
  6. Step 4: Implement Installation Logic
  7. Step 5: Add Uninstallation Support
  8. Step 6: Basic Dependency Resolution
  9. Step 7: Repository Support
  10. Testing the Package Manager
  11. Challenges and Future Improvements
  12. Conclusion
  13. References

Prerequisites

Before diving in, ensure you have:

  • Linux Environment: A Linux system (Ubuntu, Arch, or any distro) to test your package manager.
  • Programming Knowledge: Basic Python (we’ll use Python for simplicity, but you could use C/C++ or Rust).
  • Familiarity with Core Tools: Understanding of tar, gzip, file permissions, and basic system directories (e.g., /usr/bin, /etc).
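  • SQLite: For storing package metadata. Python’s built-in sqlite3 module is all the scripts need; the sqlite3 command-line tool (install via sudo apt install sqlite3 or sudo pacman -S sqlite) is only used to inspect the database during testing.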

Core Components of a Package Manager

A package manager needs several key components to function. Let’s break them down:

Component | Purpose
Package Format | A structured archive (e.g., .tar.gz) containing software files and metadata.
Metadata | Information about the package: name, version, dependencies, files, etc.
Metadata Storage | A database to track installed packages, their files, and versions.
Installation Logic | Code to extract files, resolve dependencies, and update the system.
Dependency Resolver | Ensures required software (dependencies) is installed before the package.
Repository Support | A remote server to host packages and metadata for easy distribution.

Step 1: Define the Package Format

First, we need a standard way to bundle software. Let’s design a simple package format:

Package Structure

A .mypkg package (custom extension) will be a tar.gz archive containing two parts:

  1. Payload: The actual software files (binaries, configs, etc.).
  2. Metadata File: A metadata.json file describing the package.

Example Metadata (metadata.json)

{
  "name": "hello-world",
  "version": "1.0.0",
  "description": "A simple hello-world program",
  "maintainer": "[email protected]",
  "dependencies": ["libc6"],
  "files": [
    "/usr/bin/hello",
    "/usr/share/man/man1/hello.1.gz"
  ],
  "install_size": 12345
}

Each entry in dependencies names another package (typically another .mypkg package), and install_size is the payload size in bytes. JSON has no comment syntax, so notes like these belong outside the file.

Why This Structure?

  • Simplicity: tar.gz is widely supported and easy to work with in Python.
  • Metadata in JSON: Human-readable and machine-parsable.
  • Explicit File List: Ensures we can track and remove files during uninstallation.
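
Because every later step trusts metadata.json, it helps to check its shape before using it. The following is a minimal sketch, not part of the tools built in this guide: the helper name validate_metadata and the choice of required fields are assumptions based on the example metadata above.

```python
# Required keys and their expected types, taken from the example metadata above.
REQUIRED_FIELDS = {
    "name": str,
    "version": str,
    "dependencies": list,
    "files": list,
    "install_size": int,
}

def validate_metadata(metadata):
    """Return a list of problems found in a metadata dict (empty if it looks valid)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
        elif not isinstance(metadata[field], expected):
            problems.append(f"{field} should be of type {expected.__name__}")

    # Every listed path must be absolute so it can be tracked and removed later.
    files = metadata.get("files")
    if isinstance(files, list):
        for path in files:
            if not isinstance(path, str) or not path.startswith("/"):
                problems.append(f"file path is not absolute: {path!r}")
    return problems
```

A caller could refuse to build or install a package whenever the returned list is non-empty, and print the problems so the packager can fix them.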

Step 2: Build a Package Creator Tool

Next, we need a script to generate .mypkg packages from a directory of software files. Let’s write a Python tool called mypkg-build.

Step 2.1: Script Logic

  1. Take a source directory (e.g., hello-world-src/) containing software files.
  2. Generate metadata.json (user provides metadata via CLI flags).
  3. Archive the source directory and metadata.json into a .mypkg file.

Step 2.2: Code for mypkg-build

Create mypkg-build (save as ~/mypkg-manager/mypkg-build):

#!/usr/bin/env python3  
import argparse  
import json  
import tarfile  
import os  

def main():  
    parser = argparse.ArgumentParser(description='Build a .mypkg package')  
    parser.add_argument('--name', required=True, help='Package name')  
    parser.add_argument('--version', required=True, help='Package version')  
    parser.add_argument('--src-dir', required=True, help='Source directory with files')  
    parser.add_argument('--dependencies', nargs='*', default=[], help='Dependencies')  
    args = parser.parse_args()  

    # Step 1: Generate metadata.json  
    metadata = {  
        "name": args.name,  
        "version": args.version,  
        "dependencies": args.dependencies,  
        "files": [],  # We'll populate this next  
        "install_size": 0  
    }  

    # Step 2: Collect files and calculate size
    total_size = 0
    for root, _, files in os.walk(args.src_dir):
        for file in files:
            file_path = os.path.join(root, file)
            rel_path = os.path.relpath(file_path, args.src_dir)
            # Skip a metadata.json left over from an earlier build; it is not payload
            if rel_path == "metadata.json":
                continue
            # Store the install path (e.g., usr/bin/hello in src_dir becomes /usr/bin/hello)
            abs_path = f"/{rel_path}"
            metadata["files"].append(abs_path)
            total_size += os.path.getsize(file_path)
    metadata["install_size"] = total_size

    # Save metadata to src_dir/metadata.json
    with open(os.path.join(args.src_dir, "metadata.json"), "w") as f:
        json.dump(metadata, f, indent=2)

    # Step 3: Create .mypkg (tar.gz) package
    pkg_name = f"{args.name}-{args.version}.mypkg"
    with tarfile.open(pkg_name, "w:gz") as tar:
        # Add each top-level entry so archive paths stay relative (usr/bin/hello, metadata.json)
        for entry in sorted(os.listdir(args.src_dir)):
            tar.add(os.path.join(args.src_dir, entry), arcname=entry)

    print(f"Package created: {pkg_name}")  

if __name__ == "__main__":  
    main()  

Make It Executable

chmod +x mypkg-build  

Test the Tool

Create a test package:

# Create a source directory with a sample binary  
mkdir -p hello-world-src/usr/bin  
printf '#!/bin/sh\necho "Hello, Custom Package Manager!"\n' > hello-world-src/usr/bin/hello
chmod +x hello-world-src/usr/bin/hello  

# Build the package  
./mypkg-build --name hello-world --version 1.0.0 --src-dir hello-world-src --dependencies libc6  

You’ll get hello-world-1.0.0.mypkg!
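
To confirm what ended up inside the archive, the package can be inspected without unpacking it. This is a small sketch under the assumption that metadata.json sits at the top level of the archive, as described in Step 1; read_package is a hypothetical helper name.

```python
import json
import tarfile

def read_package(pkg_path):
    """Return (metadata, file_names) for a .mypkg archive without writing anything to disk."""
    with tarfile.open(pkg_path, "r:gz") as tar:
        # Only regular files are listed; directory entries are skipped.
        names = [member.name for member in tar.getmembers() if member.isfile()]
        # extractfile() returns a readable file object for the member.
        with tar.extractfile("metadata.json") as f:
            metadata = json.load(f)
    return metadata, names
```

Calling read_package("hello-world-1.0.0.mypkg") should return the parsed metadata together with archive paths such as usr/bin/hello, which makes it easy to compare the files list against the actual payload.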

Step 3: Metadata Storage with SQLite

To track installed packages, we’ll use an SQLite database. Let’s design the schema.

Database Schema

Create a database mypkg.db with three tables:

  1. packages: Stores package metadata.

    CREATE TABLE packages (  
        id INTEGER PRIMARY KEY AUTOINCREMENT,  
        name TEXT UNIQUE NOT NULL,  
        version TEXT NOT NULL,  
        install_size INTEGER NOT NULL,  
        installed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP  
    );

  2. files: Maps files to their packages (to track what to uninstall).

    CREATE TABLE files (  
        id INTEGER PRIMARY KEY AUTOINCREMENT,  
        package_id INTEGER NOT NULL,  
        path TEXT UNIQUE NOT NULL,  
        FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE  
    );

  3. dependencies: Tracks which packages depend on others.

    CREATE TABLE dependencies (  
        id INTEGER PRIMARY KEY AUTOINCREMENT,  
        package_id INTEGER NOT NULL,  
        dependency_name TEXT NOT NULL,  
        FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE  
    );  

Initialize the Database

Create a Python script init_db.py to set up the schema:

import sqlite3  

conn = sqlite3.connect("mypkg.db")  
cursor = conn.cursor()  

# Create tables  
cursor.execute("""  
CREATE TABLE IF NOT EXISTS packages (  
    id INTEGER PRIMARY KEY AUTOINCREMENT,  
    name TEXT UNIQUE NOT NULL,  
    version TEXT NOT NULL,  
    install_size INTEGER NOT NULL,  
    installed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP  
)  
""")  

cursor.execute("""  
CREATE TABLE IF NOT EXISTS files (  
    id INTEGER PRIMARY KEY AUTOINCREMENT,  
    package_id INTEGER NOT NULL,  
    path TEXT UNIQUE NOT NULL,  
    FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE  
)  
""")  

cursor.execute("""  
CREATE TABLE IF NOT EXISTS dependencies (  
    id INTEGER PRIMARY KEY AUTOINCREMENT,  
    package_id INTEGER NOT NULL,  
    dependency_name TEXT NOT NULL,  
    FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE  
)  
""")  

conn.commit()  
conn.close()  
print("Database initialized: mypkg.db")  

Run it: python3 init_db.py
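
With the schema in place, a typical query is “which package owns this file?”. The sketch below joins the files and packages tables defined above; the helper name owner_of is an assumption made for illustration.

```python
import sqlite3

def owner_of(conn: sqlite3.Connection, path: str):
    """Return the name of the package that installed `path`, or None if untracked."""
    row = conn.execute(
        """
        SELECT packages.name
        FROM files
        JOIN packages ON packages.id = files.package_id
        WHERE files.path = ?
        """,
        (path,),
    ).fetchone()
    return row[0] if row else None
```

For example, owner_of(sqlite3.connect("mypkg.db"), "/usr/bin/hello") would return the owning package name once a package providing that file has been recorded. The same lookup is a natural building block for detecting file conflicts before an install overwrites something.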

Step 4: Implement Installation Logic

Now, let’s build the installer tool mypkg-install to:

  1. Extract the .mypkg archive.
  2. Validate metadata.
  3. Resolve dependencies (basic check).
  4. Copy files to the system.
  5. Update the database.

Code for mypkg-install

#!/usr/bin/env python3  
import argparse  
import tarfile  
import json  
import os  
import sqlite3  
import shutil  

def get_db_connection():  
    return sqlite3.connect("mypkg.db")  

def check_dependencies(metadata):  
    """Check if all dependencies are installed."""  
    conn = get_db_connection()  
    cursor = conn.cursor()  
    missing = []  
    for dep in metadata["dependencies"]:  
        cursor.execute("SELECT name FROM packages WHERE name = ?", (dep,))  
        if not cursor.fetchone():  
            missing.append(dep)  
    conn.close()  
    return missing  

def install_package(pkg_path):  
    # Step 1: Extract the package  
    with tarfile.open(pkg_path, "r:gz") as tar:  
        # Extract metadata.json first  
        tar.extract("metadata.json")  
        with open("metadata.json") as f:  
            metadata = json.load(f)  

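        # Step 2: Refuse to reinstall, then check dependencies
        # (packages.name is UNIQUE, so a second INSERT would fail after files were already copied)
        conn = get_db_connection()
        already = conn.execute(
            "SELECT 1 FROM packages WHERE name = ?", (metadata["name"],)
        ).fetchone()
        conn.close()
        if already:
            print(f"Error: {metadata['name']} is already installed.")
            os.remove("metadata.json")  # Cleanup
            return

        missing_deps = check_dependencies(metadata)
        if missing_deps:
            print(f"Error: Missing dependencies: {', '.join(missing_deps)}")
            os.remove("metadata.json")  # Cleanup
            return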

        # Step 3: Extract files to system (e.g., /usr/bin, /usr/share)  
        for file in metadata["files"]:  
            # Extract from tar (tar contains relative paths like usr/bin/hello)  
            src = file.lstrip("/")  # Remove leading / to get tar path  
            try:  
                tar.extract(src, path="/")  # Extract to root (requires sudo!)  
                print(f"Installed: {file}")  
            except KeyError:  
                print(f"Warning: File {src} not found in package.")  

        # Step 4: Update database  
        conn = get_db_connection()  
        cursor = conn.cursor()  

        # Insert package  
        cursor.execute("""  
        INSERT INTO packages (name, version, install_size)  
        VALUES (?, ?, ?)  
        """, (metadata["name"], metadata["version"], metadata["install_size"]))  
        pkg_id = cursor.lastrowid  

        # Insert files  
        for file in metadata["files"]:  
            cursor.execute("INSERT INTO files (package_id, path) VALUES (?, ?)", (pkg_id, file))  

        # Insert dependencies  
        for dep in metadata["dependencies"]:  
            cursor.execute("INSERT INTO dependencies (package_id, dependency_name) VALUES (?, ?)", (pkg_id, dep))  

        conn.commit()  
        conn.close()  
        os.remove("metadata.json")  # Cleanup  
        print(f"Successfully installed {metadata['name']} v{metadata['version']}")  

if __name__ == "__main__":  
    parser = argparse.ArgumentParser(description='Install a .mypkg package')  
    parser.add_argument('pkg_path', help='Path to .mypkg file')  
    args = parser.parse_args()  
    install_package(args.pkg_path)  

Make It Executable

chmod +x mypkg-install  
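
The installer writes to / as root, so it is worth checking that a path listed in metadata.json cannot escape the intended install root (for example through ".." components). The following is a minimal sketch of such a check; resolve_target is a hypothetical helper, and the idea of an alternate root directory is an assumption that also makes testing possible without sudo. It does not guard against symlinks inside the archive.

```python
import os

def resolve_target(root, listed_path):
    """Map a path from metadata["files"] to its location under `root`.

    Raises ValueError if the normalized result would fall outside `root`.
    """
    root = os.path.abspath(root)
    # Strip the leading "/" so the path is joined beneath root instead of replacing it.
    target = os.path.normpath(os.path.join(root, listed_path.lstrip("/")))
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"unsafe path in package: {listed_path}")
    return target
```

With a staging root such as /tmp/stage, the path /usr/bin/hello maps to /tmp/stage/usr/bin/hello, while /../../etc/passwd is rejected. Python 3.12 and later also accept a filter argument on tarfile extraction methods, which is worth reading about before extracting untrusted archives.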

Step 5: Add Uninstallation Support

To uninstall, we need to:

  1. Find all files associated with the package.
  2. Remove them from the system.
  3. Delete the package from the database.

Code for mypkg-remove

#!/usr/bin/env python3  
import argparse  
import sqlite3  
import os  

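def get_db_connection():
    conn = sqlite3.connect("mypkg.db")
    # SQLite only honors ON DELETE CASCADE when foreign keys are enabled on the connection
    conn.execute("PRAGMA foreign_keys = ON")
    return conn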

def uninstall_package(pkg_name):  
    conn = get_db_connection()  
    cursor = conn.cursor()  

    # Get package ID and files  
    cursor.execute("SELECT id FROM packages WHERE name = ?", (pkg_name,))  
    pkg = cursor.fetchone()  
    if not pkg:
        print(f"Error: Package {pkg_name} not installed.")
        conn.close()
        return

    pkg_id = pkg[0]  

    # Get files to remove  
    cursor.execute("SELECT path FROM files WHERE package_id = ?", (pkg_id,))  
    files = [row[0] for row in cursor.fetchall()]  

    # Remove files  
    for file in files:  
        if os.path.exists(file):  
            os.remove(file)  
            print(f"Removed: {file}")  
        else:  
            print(f"Warning: {file} not found (already removed?)")  

    # Delete package from database (cascade deletes files/dependencies)  
    cursor.execute("DELETE FROM packages WHERE id = ?", (pkg_id,))  
    conn.commit()  
    conn.close()  
    print(f"Uninstalled {pkg_name}")  

if __name__ == "__main__":  
    parser = argparse.ArgumentParser(description='Uninstall a package')  
    parser.add_argument('pkg_name', help='Name of package to uninstall')  
    args = parser.parse_args()  
    uninstall_package(args.pkg_name)  

Make it executable: chmod +x mypkg-remove
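
Before removing a package, a more careful tool would check whether other installed packages still depend on it. Here is a minimal sketch using the dependencies table from Step 3; find_dependents is a hypothetical helper and is not wired into mypkg-remove above.

```python
import sqlite3

def find_dependents(conn: sqlite3.Connection, pkg_name: str):
    """Return names of installed packages that list `pkg_name` as a dependency."""
    rows = conn.execute(
        """
        SELECT packages.name
        FROM dependencies
        JOIN packages ON packages.id = dependencies.package_id
        WHERE dependencies.dependency_name = ?
        ORDER BY packages.name
        """,
        (pkg_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

If the returned list is non-empty, the uninstaller could print the dependents and stop, or require an explicit force option before continuing.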

Step 6: Basic Dependency Resolution

Our current check_dependencies function only verifies if dependencies are installed. For a real-world tool, we’d need to:

  • Fetch dependencies from a repository.
  • Resolve version conflicts (e.g., Package A needs libX v1, Package B needs libX v2).

Improving the Dependency Check

The check_dependencies function itself stays the same. The improvement is in install_package in mypkg-install, which now tells the user what to do about missing dependencies:

def check_dependencies(metadata):
    conn = get_db_connection()
    cursor = conn.cursor()
    missing = []
    for dep in metadata["dependencies"]:
        cursor.execute("SELECT name FROM packages WHERE name = ?", (dep,))
        if not cursor.fetchone():
            missing.append(dep)
    conn.close()
    return missing

# In install_package():
missing_deps = check_dependencies(metadata)
if missing_deps:
    print(f"Error: Missing dependencies: {', '.join(missing_deps)}")
    # mypkg-install takes one package path per run, so install each dependency separately
    print("Install each one first, e.g.: mypkg-install <dependency>.mypkg")
    os.remove("metadata.json")
    return

Step 7: Repository Support

To distribute packages, host them on a remote server (e.g., Apache, Nginx, or even GitHub Pages).

Repository Structure

  • A packages.list file listing all .mypkg packages and their metadata.
  • A directory pkgs/ containing the .mypkg files.

Example packages.list

[  
  {  
    "name": "hello-world",  
    "version": "1.0.0",  
    "url": "https://your-server.com/pkgs/hello-world-1.0.0.mypkg",  
    "dependencies": ["libc6"]  
  },  
  {  
    "name": "libc6",  
    "version": "2.31",  
    "url": "https://your-server.com/pkgs/libc6-2.31.mypkg",  
    "dependencies": []  
  }  
]  

Add mypkg-update to Fetch Repo Metadata

Write a script to download packages.list and store it locally for dependency checks.
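
One possible shape for that script is sketched below. It assumes packages.list is the JSON array shown above; the function names fetch_package_index and find_package, the default destination file, and the 30-second timeout are all illustrative choices rather than a fixed design.

```python
import json
import urllib.request

def fetch_package_index(url, dest="packages.list"):
    """Download the repository index from `url`, check its shape, and save it to `dest`."""
    with urllib.request.urlopen(url, timeout=30) as response:
        raw = response.read()
    index = json.loads(raw)  # raises ValueError if the body is not valid JSON
    if not isinstance(index, list):
        raise ValueError("packages.list must contain a JSON array")
    with open(dest, "w") as f:
        json.dump(index, f, indent=2)
    return index

def find_package(index, name):
    """Return the index entry for `name`, or None if the repository does not list it."""
    return next((entry for entry in index if entry.get("name") == name), None)
```

A mypkg-update command could call fetch_package_index with the repository URL, and the installer could later use find_package to look up the url of a missing dependency instead of only reporting it.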


Testing the Package Manager

Let’s test our tools end-to-end:

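  1. Build a test package. Leave out --dependencies here: libc6 is not recorded in mypkg.db, so declaring it would make the install step stop with a missing-dependency error:

    ./mypkg-build --name hello-world --version 1.0.0 --src-dir hello-world-src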
  2. Install it (requires sudo to write to /usr/bin):

    sudo ./mypkg-install hello-world-1.0.0.mypkg  
  3. Verify installation:

    hello  # Should print "Hello, Custom Package Manager!"  
    sqlite3 mypkg.db "SELECT name, version FROM packages;"  # Should show hello-world|1.0.0
  4. Uninstall:

    sudo ./mypkg-remove hello-world  
    hello  # Should error (command not found)  

Challenges and Future Improvements

Building a production-ready package manager requires solving harder problems:

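  • Dependency Resolution: Use topological sorting to compute a valid install order for a dependency graph. Version constraints and conflicts need a proper solver on top of that (DNF, for example, uses the SAT-based libsolv library).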
  • File Conflicts: Check if a file is already owned by another package before overwriting.
  • Upgrades: Support updating packages to newer versions (e.g., mypkg-upgrade hello-world).
  • Signing: Digitally sign packages to prevent tampering (use GPG).
  • Compression: Use zstd instead of gzip for faster extraction.
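
As a starting point for the dependency-resolution item, install order can be computed with a topological sort. The sketch below uses graphlib from the Python standard library (3.9+) and assumes an index shaped like the packages.list example from Step 7; install_order is a hypothetical helper, and it ignores versions entirely.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def install_order(index, target):
    """Return package names so that every dependency comes before its dependents."""
    by_name = {entry["name"]: entry for entry in index}
    graph = {}          # package name -> set of names it depends on
    pending = [target]
    while pending:
        name = pending.pop()
        if name in graph:
            continue  # already visited
        if name not in by_name:
            raise KeyError(f"package not found in index: {name}")
        deps = by_name[name]["dependencies"]
        graph[name] = set(deps)
        pending.extend(deps)
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())
```

For the example index, install_order(index, "hello-world") returns ["libc6", "hello-world"], which is the order in which the packages would need to be installed.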

Conclusion

You’ve built a basic but functional Linux package manager! You now understand how packages are structured, metadata is tracked, and software is installed/uninstalled.

This is just the beginning—expand it with features like repository support, upgrades, or a GUI frontend. The sky’s the limit!

References