Table of Contents
- Prerequisites
- Core Components of a Package Manager
- Step 1: Define the Package Format
- Step 2: Build a Package Creator Tool
- Step 3: Metadata Storage with SQLite
- Step 4: Implement Installation Logic
- Step 5: Add Uninstallation Support
- Step 6: Basic Dependency Resolution
- Step 7: Repository Support
- Testing Your Package Manager
- Challenges and Future Improvements
- Conclusion
- References
Prerequisites
Before diving in, ensure you have:
- Linux Environment: A Linux system (Ubuntu, Arch, or any distro) to test your package manager.
- Programming Knowledge: Basic Python (we’ll use Python for simplicity, but you could use C/C++ or Rust).
- Familiarity with Core Tools: Understanding of tar, gzip, file permissions, and basic system directories (e.g., /usr/bin, /etc).
- SQLite: For storing package metadata (install via sudo apt install sqlite3 or sudo pacman -S sqlite).
Core Components of a Package Manager
A package manager needs several key components to function. Let’s break them down:
| Component | Purpose |
|---|---|
| Package Format | A structured archive (e.g., .tar.gz) containing software files and metadata. |
| Metadata | Information about the package: name, version, dependencies, files, etc. |
| Metadata Storage | A database to track installed packages, their files, and versions. |
| Installation Logic | Code to extract files, resolve dependencies, and update the system. |
| Dependency Resolver | Ensures required software (dependencies) is installed before the package. |
| Repository Support | A remote server to host packages and metadata for easy distribution. |
Step 1: Define the Package Format
First, we need a standard way to bundle software. Let’s design a simple package format:
Package Structure
A .mypkg package (custom extension) will be a tar.gz archive containing two parts:
- Payload: The actual software files (binaries, configs, etc.).
- Metadata File: A metadata.json file describing the package.
Example Metadata (metadata.json)
{
  "name": "hello-world",
  "version": "1.0.0",
  "description": "A simple hello-world program",
  "maintainer": "[email protected]",
  "dependencies": ["libc6"],
  "files": [
    "/usr/bin/hello",
    "/usr/share/man/man1/hello.1.gz"
  ],
  "install_size": 12345
}
The dependencies list names other packages (these could be other .mypkg packages), and install_size is given in bytes. These notes sit outside the file because JSON does not allow comments.
Why This Structure?
- Simplicity: tar.gz is widely supported and easy to work with in Python.
- Metadata in JSON: Human-readable and machine-parsable.
- Explicit File List: Ensures we can track and remove files during uninstallation.
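Because the installer will trust whatever metadata.json says, it can help to check the file before acting on it. The following is a minimal sketch, not part of the tools built in this guide: the name validate_metadata and the REQUIRED_FIELDS table are illustrative assumptions, and the required fields are simply the ones that mypkg-build writes.

```python
import json

# Assumed required fields and their types (the fields mypkg-build writes).
REQUIRED_FIELDS = {
    "name": str,
    "version": str,
    "dependencies": list,
    "files": list,
    "install_size": int,
}


def validate_metadata(metadata):
    """Return a list of problems found in a metadata dict (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
        elif not isinstance(metadata[field], expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")

    # Each listed file must be an absolute path without '..' components,
    # so a package cannot point outside the paths it declares.
    files = metadata.get("files")
    if isinstance(files, list):
        for path in files:
            if (not isinstance(path, str)
                    or not path.startswith("/")
                    or ".." in path.split("/")):
                problems.append(f"unsafe file path: {path!r}")
    return problems


example = json.loads("""
{
  "name": "hello-world",
  "version": "1.0.0",
  "dependencies": [],
  "files": ["/usr/bin/hello"],
  "install_size": 12345
}
""")
print(validate_metadata(example))  # an empty list means no problems were found
```

Returning a list of problems rather than raising on the first one lets a tool report everything that is wrong with a package in one pass.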
Step 2: Build a Package Creator Tool
Next, we need a script to generate .mypkg packages from a directory of software files. Let’s write a Python tool called mypkg-build.
Step 2.1: Script Logic
- Take a source directory (e.g., hello-world-src/) containing software files.
- Generate metadata.json (user provides metadata via CLI flags).
- Archive the source directory and metadata.json into a .mypkg file.
Step 2.2: Code for mypkg-build
Create mypkg-build (save as ~/mypkg-manager/mypkg-build):
#!/usr/bin/env python3
import argparse
import json
import os
import tarfile


def main():
    parser = argparse.ArgumentParser(description='Build a .mypkg package')
    parser.add_argument('--name', required=True, help='Package name')
    parser.add_argument('--version', required=True, help='Package version')
    parser.add_argument('--src-dir', required=True, help='Source directory with files')
    parser.add_argument('--dependencies', nargs='*', default=[], help='Dependencies')
    args = parser.parse_args()

    # Step 1: Generate metadata.json
    metadata = {
        "name": args.name,
        "version": args.version,
        "dependencies": args.dependencies,
        "files": [],  # We'll populate this next
        "install_size": 0
    }

    # Step 2: Collect files and calculate size
    metadata_path = os.path.join(args.src_dir, "metadata.json")
    total_size = 0
    for root, _, files in os.walk(args.src_dir):
        for file in files:
            file_path = os.path.join(root, file)
            # Skip a metadata.json left over from a previous build
            if os.path.abspath(file_path) == os.path.abspath(metadata_path):
                continue
            # Store the install path (e.g., src_dir/usr/bin/hello becomes /usr/bin/hello)
            rel_path = os.path.relpath(file_path, args.src_dir)
            metadata["files"].append(f"/{rel_path}")
            total_size += os.path.getsize(file_path)
    metadata["install_size"] = total_size

    # Save metadata to src_dir/metadata.json
    with open(metadata_path, "w") as f:
        json.dump(metadata, f, indent=2)

    # Step 3: Create .mypkg (tar.gz) package
    pkg_name = f"{args.name}-{args.version}.mypkg"
    with tarfile.open(pkg_name, "w:gz") as tar:
        # Add each top-level entry so archive paths are relative
        # (usr/bin/hello, metadata.json) and no empty-named root entry is created
        for entry in sorted(os.listdir(args.src_dir)):
            tar.add(os.path.join(args.src_dir, entry), arcname=entry)
    print(f"Package created: {pkg_name}")


if __name__ == "__main__":
    main()
Make It Executable
chmod +x mypkg-build
Test the Tool
Create a test package:
# Create a source directory with a sample binary
mkdir -p hello-world-src/usr/bin
echo -e '#!/bin/sh\n echo "Hello, Custom Package Manager!"' > hello-world-src/usr/bin/hello
chmod +x hello-world-src/usr/bin/hello
# Build the package
./mypkg-build --name hello-world --version 1.0.0 --src-dir hello-world-src --dependencies libc6
You’ll get hello-world-1.0.0.mypkg!
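To confirm what actually ended up in the archive, the package can be opened with the same tarfile module. This is a small sketch for inspection only; the helper name read_package_metadata is an assumption made for illustration and is not one of the tools in this guide.

```python
import json
import tarfile


def read_package_metadata(pkg_path):
    """Return (metadata dict, list of member names) for a .mypkg archive."""
    with tarfile.open(pkg_path, "r:gz") as tar:
        names = tar.getnames()
        # extractfile() gives a file-like object without writing anything to disk;
        # it raises KeyError if the archive has no metadata.json member.
        member = tar.extractfile("metadata.json")
        metadata = json.load(member)
    return metadata, names


# Example usage (assumes the package built above is in the current directory):
# metadata, names = read_package_metadata("hello-world-1.0.0.mypkg")
# print(metadata["files"], names)
```

Reading the metadata this way is also a convenient check that the member paths are relative (for example usr/bin/hello), which is what the installer in Step 4 expects.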
Step 3: Metadata Storage with SQLite
To track installed packages, we’ll use an SQLite database. Let’s design the schema.
Database Schema
Create a database mypkg.db with three tables:
- packages: Stores package metadata.
  CREATE TABLE packages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    version TEXT NOT NULL,
    install_size INTEGER NOT NULL,
    installed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
  );
- files: Maps files to their packages (to track what to uninstall).
  CREATE TABLE files (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    package_id INTEGER NOT NULL,
    path TEXT UNIQUE NOT NULL,
    FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE
  );
- dependencies: Tracks which packages depend on others.
  CREATE TABLE dependencies (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    package_id INTEGER NOT NULL,
    dependency_name TEXT NOT NULL,
    FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE
  );
Initialize the Database
Create a Python script init_db.py to set up the schema:
import sqlite3
conn = sqlite3.connect("mypkg.db")
cursor = conn.cursor()
# Create tables
cursor.execute("""
CREATE TABLE IF NOT EXISTS packages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL,
version TEXT NOT NULL,
install_size INTEGER NOT NULL,
installed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
cursor.execute("""
CREATE TABLE IF NOT EXISTS files (
id INTEGER PRIMARY KEY AUTOINCREMENT,
package_id INTEGER NOT NULL,
path TEXT UNIQUE NOT NULL,
FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE
)
""")
cursor.execute("""
CREATE TABLE IF NOT EXISTS dependencies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
package_id INTEGER NOT NULL,
dependency_name TEXT NOT NULL,
FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE
)
""")
conn.commit()
conn.close()
print("Database initialized: mypkg.db")
Run it: python3 init_db.py
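One detail of this schema is easy to miss: SQLite accepts ON DELETE CASCADE in the table definition but only acts on it when foreign-key enforcement is switched on for the connection, and enforcement is off by default in most builds. The sketch below shows the difference using an in-memory database and a reduced copy of the schema; the function name rows_left_after_delete is made up for this illustration.

```python
import sqlite3

# Reduced version of the packages/files tables from the schema above
SCHEMA = """
CREATE TABLE packages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE files (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    package_id INTEGER NOT NULL,
    path TEXT UNIQUE NOT NULL,
    FOREIGN KEY (package_id) REFERENCES packages(id) ON DELETE CASCADE
);
"""


def rows_left_after_delete(enforce_foreign_keys):
    """Delete a package and return how many of its file rows remain."""
    conn = sqlite3.connect(":memory:")
    # The pragma applies per connection and must be set outside a transaction.
    setting = "ON" if enforce_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {setting}")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO packages (name) VALUES ('hello-world')")
    conn.execute("INSERT INTO files (package_id, path) VALUES (1, '/usr/bin/hello')")
    conn.execute("DELETE FROM packages WHERE name = 'hello-world'")
    remaining = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
    conn.close()
    return remaining


print(rows_left_after_delete(True))   # 0: the cascade removed the file row
print(rows_left_after_delete(False))  # 1: the file row was left behind
```

Any tool that relies on the cascade therefore needs to run PRAGMA foreign_keys = ON right after connecting.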
Step 4: Implement Installation Logic
Now, let’s build the installer tool mypkg-install to:
- Extract the .mypkg archive.
- Validate metadata.
- Resolve dependencies (basic check).
- Copy files to the system.
- Update the database.
Code for mypkg-install
#!/usr/bin/env python3
import argparse
import json
import os
import sqlite3
import tarfile

# Use tarfile's "data" extraction filter on Python versions that provide it
EXTRACT_KWARGS = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}


def get_db_connection():
    conn = sqlite3.connect("mypkg.db")
    # SQLite enforces foreign keys (and ON DELETE CASCADE) only when this is on
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def check_dependencies(metadata):
    """Check if all dependencies are installed."""
    conn = get_db_connection()
    cursor = conn.cursor()
    missing = []
    for dep in metadata["dependencies"]:
        cursor.execute("SELECT name FROM packages WHERE name = ?", (dep,))
        if not cursor.fetchone():
            missing.append(dep)
    conn.close()
    return missing


def install_package(pkg_path):
    # Step 1: Extract the package
    with tarfile.open(pkg_path, "r:gz") as tar:
        # Extract metadata.json first
        tar.extract("metadata.json", **EXTRACT_KWARGS)
        with open("metadata.json") as f:
            metadata = json.load(f)

        # Stop early if already installed: packages.name is UNIQUE,
        # so the INSERT in Step 4 would fail after files were written
        conn = get_db_connection()
        installed = conn.execute(
            "SELECT 1 FROM packages WHERE name = ?", (metadata["name"],)
        ).fetchone()
        conn.close()
        if installed:
            print(f"Error: {metadata['name']} is already installed.")
            os.remove("metadata.json")  # Cleanup
            return

        # Step 2: Check dependencies
        missing_deps = check_dependencies(metadata)
        if missing_deps:
            print(f"Error: Missing dependencies: {', '.join(missing_deps)}")
            os.remove("metadata.json")  # Cleanup
            return

        # Step 3: Extract files to system (e.g., /usr/bin, /usr/share)
        for file in metadata["files"]:
            # Extract from tar (tar contains relative paths like usr/bin/hello)
            src = file.lstrip("/")  # Remove leading / to get tar path
            try:
                tar.extract(src, path="/", **EXTRACT_KWARGS)  # Extract to root (requires sudo!)
                print(f"Installed: {file}")
            except KeyError:
                print(f"Warning: File {src} not found in package.")

    # Step 4: Update database
    conn = get_db_connection()
    cursor = conn.cursor()
    # Insert package
    cursor.execute("""
        INSERT INTO packages (name, version, install_size)
        VALUES (?, ?, ?)
    """, (metadata["name"], metadata["version"], metadata["install_size"]))
    pkg_id = cursor.lastrowid
    # Insert files
    for file in metadata["files"]:
        cursor.execute("INSERT INTO files (package_id, path) VALUES (?, ?)", (pkg_id, file))
    # Insert dependencies
    for dep in metadata["dependencies"]:
        cursor.execute("INSERT INTO dependencies (package_id, dependency_name) VALUES (?, ?)", (pkg_id, dep))
    conn.commit()
    conn.close()
    os.remove("metadata.json")  # Cleanup
    print(f"Successfully installed {metadata['name']} v{metadata['version']}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Install a .mypkg package')
    parser.add_argument('pkg_path', help='Path to .mypkg file')
    args = parser.parse_args()
    install_package(args.pkg_path)
Make It Executable
chmod +x mypkg-install
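The installer writes each file without first checking whether another package already owns that path. One possible pre-install check, sketched here against the files and packages tables from Step 3, looks up every path before anything is extracted. The function name find_conflicts, the package name greeter, and the in-memory demo data are illustrative assumptions.

```python
import sqlite3


def find_conflicts(conn, pkg_name, paths):
    """Return {path: owner} for paths already owned by a different package."""
    conflicts = {}
    for path in paths:
        row = conn.execute(
            """
            SELECT p.name
            FROM files AS f
            JOIN packages AS p ON p.id = f.package_id
            WHERE f.path = ?
            """,
            (path,),
        ).fetchone()
        if row and row[0] != pkg_name:
            conflicts[path] = row[0]
    return conflicts


# Demo against an in-memory database with a reduced version of the schema
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE files (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        package_id INTEGER NOT NULL REFERENCES packages(id) ON DELETE CASCADE,
        path TEXT UNIQUE NOT NULL
    );
    INSERT INTO packages (name) VALUES ('hello-world');
    INSERT INTO files (package_id, path) VALUES (1, '/usr/bin/hello');
""")
print(find_conflicts(conn, "greeter", ["/usr/bin/hello", "/usr/bin/greet"]))
# {'/usr/bin/hello': 'hello-world'}
```

An installer could call this with metadata["files"] and refuse to continue if the result is non-empty.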
Step 5: Add Uninstallation Support
To uninstall, we need to:
- Find all files associated with the package.
- Remove them from the system.
- Delete the package from the database.
Code for mypkg-remove
#!/usr/bin/env python3
import argparse
import os
import sqlite3


def get_db_connection():
    conn = sqlite3.connect("mypkg.db")
    # Without this pragma SQLite ignores ON DELETE CASCADE and leaves stale rows behind
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def uninstall_package(pkg_name):
    conn = get_db_connection()
    cursor = conn.cursor()

    # Get package ID and files
    cursor.execute("SELECT id FROM packages WHERE name = ?", (pkg_name,))
    pkg = cursor.fetchone()
    if not pkg:
        print(f"Error: Package {pkg_name} not installed.")
        conn.close()
        return
    pkg_id = pkg[0]

    # Get files to remove
    cursor.execute("SELECT path FROM files WHERE package_id = ?", (pkg_id,))
    files = [row[0] for row in cursor.fetchall()]

    # Remove files
    for file in files:
        if os.path.exists(file):
            os.remove(file)
            print(f"Removed: {file}")
        else:
            print(f"Warning: {file} not found (already removed?)")

    # Delete package from database (the cascade also deletes its files/dependencies rows)
    cursor.execute("DELETE FROM packages WHERE id = ?", (pkg_id,))
    conn.commit()
    conn.close()
    print(f"Uninstalled {pkg_name}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Uninstall a package')
    parser.add_argument('pkg_name', help='Name of package to uninstall')
    args = parser.parse_args()
    uninstall_package(args.pkg_name)
Make it executable: chmod +x mypkg-remove
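The remover deletes a package even when other installed packages still depend on it. Since the dependencies table already records who depends on what, a reverse lookup is cheap. The following is a sketch under the Step 3 schema; the function name find_dependents and the demo rows are assumptions for illustration.

```python
import sqlite3


def find_dependents(conn, pkg_name):
    """Return names of installed packages that list pkg_name as a dependency."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM dependencies AS d
        JOIN packages AS p ON p.id = d.package_id
        WHERE d.dependency_name = ?
        ORDER BY p.name
        """,
        (pkg_name,),
    ).fetchall()
    return [name for (name,) in rows]


# Demo against an in-memory database with a reduced version of the schema
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE dependencies (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        package_id INTEGER NOT NULL REFERENCES packages(id) ON DELETE CASCADE,
        dependency_name TEXT NOT NULL
    );
    INSERT INTO packages (name) VALUES ('libc6'), ('hello-world');
    INSERT INTO dependencies (package_id, dependency_name) VALUES (2, 'libc6');
""")
print(find_dependents(conn, "libc6"))        # ['hello-world']
print(find_dependents(conn, "hello-world"))  # []
```

A remover could refuse to uninstall, or ask for confirmation, whenever this list is non-empty.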
Step 6: Basic Dependency Resolution
Our current check_dependencies function only verifies if dependencies are installed. For a real-world tool, we’d need to:
- Fetch dependencies from a repository.
- Resolve version conflicts (e.g., Package A needs libX v1, Package B needs libX v2).
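Detecting a version conflict starts with comparing versions correctly, and plain string comparison gets this wrong as soon as a component reaches two digits. The sketch below assumes versions are purely numeric and dot-separated, like the 1.0.0 and 2.31 used in this guide; parse_version and is_newer are hypothetical helper names, and suffixes such as -beta are out of scope.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.10.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_newer(candidate, installed):
    """True if candidate is a strictly higher version than installed."""
    return parse_version(candidate) > parse_version(installed)


print("1.10.0" > "1.9.3")           # False: strings compare character by character
print(is_newer("1.10.0", "1.9.3"))  # True: tuples compare number by number
```

With this in place, a resolver can at least tell whether an installed libX satisfies a "needs version N or later" requirement before deciding that two packages conflict.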
Improving the Dependency Check
check_dependencies itself does not need to change. Update the error path in install_package (in mypkg-install) so it tells the user what to do next:
def check_dependencies(metadata):
    conn = get_db_connection()
    cursor = conn.cursor()
    missing = []
    for dep in metadata["dependencies"]:
        cursor.execute("SELECT name FROM packages WHERE name = ?", (dep,))
        if not cursor.fetchone():
            missing.append(dep)
    conn.close()
    return missing

# In install_package():
missing_deps = check_dependencies(metadata)
if missing_deps:
    print(f"Error: Missing dependencies: {', '.join(missing_deps)}")
    # mypkg-install accepts one package per run, so install each dependency separately
    print("Install each one first, e.g.: mypkg-install <dependency>.mypkg")
    os.remove("metadata.json")
    return
Step 7: Repository Support
To distribute packages, host them on a remote server (e.g., Apache, Nginx, or even GitHub Pages).
Repository Structure
- A packages.list file listing all .mypkg packages and their metadata.
- A directory pkgs/ containing the .mypkg files.
Example packages.list
[
{
"name": "hello-world",
"version": "1.0.0",
"url": "https://your-server.com/pkgs/hello-world-1.0.0.mypkg",
"dependencies": ["libc6"]
},
{
"name": "libc6",
"version": "2.31",
"url": "https://your-server.com/pkgs/libc6-2.31.mypkg",
"dependencies": []
}
]
Add mypkg-update to Fetch Repo Metadata
Write a script to download packages.list and store it locally for dependency checks.
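A minimal sketch of what such a script could look like, using only the standard library. The function name fetch_package_index, the default destination file name, and the location of packages.list on the server are all assumptions; adjust them to match wherever the repository is actually hosted.

```python
import json
import urllib.request


def fetch_package_index(url, dest="packages.list"):
    """Download packages.list, check that it parses, and save a local copy.

    Returns a dict mapping package name to its repository entry.
    """
    with urllib.request.urlopen(url, timeout=30) as response:
        raw = response.read()

    entries = json.loads(raw)  # raises ValueError if the file is not valid JSON
    if not isinstance(entries, list):
        raise ValueError("packages.list should contain a JSON array")

    # Only write the local copy once the download is known to be well-formed
    with open(dest, "wb") as f:
        f.write(raw)

    # Index by name for quick lookups during dependency checks
    return {entry["name"]: entry for entry in entries}


# Example usage (hypothetical URL, based on the server name used above):
# index = fetch_package_index("https://your-server.com/packages.list")
# print(index["hello-world"]["url"])
```

Validating before writing means a failed or truncated download does not overwrite a previously good local copy.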
Testing the Package Manager
Let’s test our tools end-to-end:
- Build a test package. Leave out --dependencies here: libc6 is not in our database, so the dependency check would reject the install.
  ./mypkg-build --name hello-world --version 1.0.0 --src-dir hello-world-src
- Install it (requires sudo to write to /usr/bin):
  sudo ./mypkg-install hello-world-1.0.0.mypkg
- Verify installation:
  hello # Should print "Hello, Custom Package Manager!"
  sqlite3 mypkg.db "SELECT name, version FROM packages;" # Should show hello-world|1.0.0
- Uninstall:
  sudo ./mypkg-remove hello-world
  hello # Should error (command not found)
Challenges and Future Improvements
Building a production-ready package manager requires solving harder problems:
- Dependency Resolution: Use topological sorting to handle complex dependency graphs (see APT’s resolver).
- File Conflicts: Check if a file is already owned by another package before overwriting.
- Upgrades: Support updating packages to newer versions (e.g., mypkg-upgrade hello-world).
- Signing: Digitally sign packages to prevent tampering (use GPG).
- Compression: Use zstd instead of gzip for faster extraction.
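As a starting point for the dependency resolution item, here is a small sketch of topological sorting by depth-first search over repository metadata shaped like the packages.list example. It only orders packages and detects cycles; it does not handle versions, and it is not how APT works internally. The name install_order is an assumption for illustration.

```python
def install_order(target, index):
    """Return package names so that every dependency comes before its dependent.

    index maps package name -> entry with a "dependencies" list.
    """
    order = []
    visiting = set()  # packages on the current DFS path (used to detect cycles)
    done = set()      # packages already placed in the order

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name}")
        if name not in index:
            raise KeyError(f"unknown package: {name}")
        visiting.add(name)
        for dep in index[name]["dependencies"]:
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)  # appended only after all of its dependencies

    visit(target)
    return order


# Index built from the example packages.list entries shown in Step 7
packages_list = [
    {"name": "hello-world", "version": "1.0.0", "dependencies": ["libc6"]},
    {"name": "libc6", "version": "2.31", "dependencies": []},
]
index = {entry["name"]: entry for entry in packages_list}
print(install_order("hello-world", index))  # ['libc6', 'hello-world']
```

Installing the packages in the returned order guarantees that the dependency check in mypkg-install passes for each one, provided the repository index is complete.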
Conclusion
You’ve built a basic but functional Linux package manager! You now understand how packages are structured, metadata is tracked, and software is installed/uninstalled.
This is just the beginning—expand it with features like repository support, upgrades, or a GUI frontend. The sky’s the limit!