yafox 20cf09453b
fix .gitignore
2020-11-25 06:02:42 +00:00

src.sh
===

usage: `src.sh <package name> [<version> [<archive type>]]`

src.sh pulls source code and checks signatures, checksums, and file sizes.
it does not apply patches.  this is not an accident.  applying patches was part
of src.sh's prototype, but patches are more relevant to the configuration and
build stages of package management than to the source acquisition stage.  sets
of patches and build environments are tightly coupled by nature.

src.sh is used as part of a source based package manager, but it is meant to be
used on its own as well, or as part of any other source based package manager.
it does what it does, and it does it as simply as possible.  commits may be few
and far between just because not much may ever need to be done to it.

it supports git repositories and tarballs as sources (or "archive types").  it
can be extended by adding archive type definition scripts to the `lib/type`
directory.

it aims to be easy to understand and customise.  the codebase for src.sh is only
260 SLOC long.  (`sloc.sh` contains the code used to arrive at the SLOC count.)

all package definitions are kept in a separate repository.  this makes it easier
for users to supply their own package definitions.  to use a package definition
repository, clone it into `$SRCROOT/pkg`.  to maintain multiple package
definition repositories, clone them elsewhere and symlink `$SRCROOT/pkg` to
whichever one should be active at any given time.
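
switching between several package definition repositories can be sketched as a
small helper.  `use_pkg_repo` is a hypothetical name and is not part of src.sh;
the sketch assumes SRCROOT is set and that the repository path is absolute.

```sh
# point $SRCROOT/pkg at one of several package definition repositories.
# usage: use_pkg_repo <absolute path to repository>
# (hypothetical helper, not part of src.sh)
use_pkg_repo() (
    : "${SRCROOT:?SRCROOT is not set}"
    repo=$1
    [ -d "$repo" ] || { echo "no such directory: $repo" >&2; exit 1; }
    # refuse to clobber a repository that was cloned straight into pkg
    if [ -e "$SRCROOT/pkg" ] && [ ! -L "$SRCROOT/pkg" ]; then
        echo "$SRCROOT/pkg exists and is not a symlink" >&2
        exit 1
    fi
    # drop the old link first so ln does not create a link inside its target
    rm -f "$SRCROOT/pkg"
    ln -s "$repo" "$SRCROOT/pkg"
)
```

the function body runs in a subshell, so its variables and any `exit` stay
local to the call.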

files in SRCROOT are assumed to be trustworthy.  no sanitisation is done on the
`urls`, `sigurls`, etc. files.  do not populate your `pkg` directory with files
you either have not carefully examined yourself or which are from sources you
would not let run arbitrary commands on your computer!


## installation

clone this repository somewhere, set the SRCROOT variable to point to that
location, and use `src.sh` directly.  creating a symlink to `src.sh` somewhere
in PATH is recommended for convenience's sake. (e.g., to call `src.sh` via `src`
throughout the whole system, `ln -s "$SRCROOT/src.sh" /bin/src`.)


## gpg

PGP signatures are checked if gpg is installed and if signatures are available.
for the check to pass, the project's maintainer keys must be on the "keyring"
gpg uses.  to avoid cluttering up the user's personal keyring, src maintains its
own gpg "home" directory at `$SRCROOT/gpg`.  by default this directory is empty
except for a CSV containing a mapping of project and maintainer names to PGP
fingerprints and URLs pointing to the resources which were used to find the
fingerprint.  this can be used to import keys as needed and to independently
verify that the correct fingerprint is listed.

for example, to import all the linux-kernel maintainer keys from the keyserver
hosted by the University of Mainz:

```
cd "$SRCROOT/gpg"
# column 3 of fingerprints.csv holds the fingerprint
grep '^linux-kernel,' fingerprints.csv | cut -d, -f3 | while read -r print; do
    gpg --homedir=. --keyserver=pgp.uni-mainz.de --recv-keys "$print"
done
```
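
to see which keys still need importing, the CSV can be compared against what is
already in the gpg home directory.  this is only a sketch: the function names
are hypothetical, the column layout (project in column 1, fingerprint in column
3) is inferred from the loop above, and fingerprints are assumed to be stored
without spaces.  it relies on gpg's `--with-colons` listing, where `fpr`
records carry the fingerprint in field 10.

```sh
# print the fingerprints listed for one project.
# usage: project_fingerprints <project> <csv file>
project_fingerprints() {
    grep "^$1," "$2" | cut -d, -f3
}

# print every fingerprint already present in a gpg home directory.
# usage: keyring_fingerprints <gpg home directory>
keyring_fingerprints() {
    gpg --homedir "$1" --with-colons --fingerprint 2>/dev/null |
        awk -F: '$1 == "fpr" { print $10 }'
}

# print the project's fingerprints that are not yet in the keyring.
# usage: missing_fingerprints <project> <gpg home directory> <csv file>
missing_fingerprints() (
    project=$1 home=$2 csv=$3
    have=$(keyring_fingerprints "$home")
    project_fingerprints "$project" "$csv" | while read -r print; do
        # -x matches whole lines, -F treats the fingerprint as a fixed string
        printf '%s\n' "$have" | grep -qxF "$print" || printf '%s\n' "$print"
    done
)
```

for example, `missing_fingerprints linux-kernel "$SRCROOT/gpg"
"$SRCROOT/gpg/fingerprints.csv"` would print one fingerprint per line, ready to
be passed to `--recv-keys`.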

because keyservers can be unreliable, a signed repository containing all the
public keys referenced in fingerprints.csv can also be found at
http://git.fuwafuwaqtlkkxwc.onion/yafox/src-keys


## defining a package

multiple archive types are supported, but care should be taken to ensure all
archive types specified in a package produce the same file structure and file
contents.  if different archive types producing different source code listings
are desired, split them up into different packages.  a package-version pair
should always produce the same source code, regardless of how the source code
was retrieved.

packages are kept in the `pkg` directory. package definitions consist of a
directory in `pkg` containing a `checks` file and a `urls` file.

for example:

    <some package>/
     |-- checks
     |-- urls

the `urls` file is a tab-delimited list associating urls, content types, and
archive types. each line is formatted as follows:

    <version>	<archive type>	<content type>	<url>

where <version> is the version of the package source code requested, <archive
type> is a string corresponding to a type definition script in `lib/type`, minus
the ".sh" file extension, <content type> is "arc" or "sig" (indicating whether
the url points to the ARChive or the SIGnature for the archive), and <url> is
the url at which the described content may be retrieved.
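
as a rough illustration of the format, a lookup over a `urls` file might look
like the following.  `lookup_url` is a hypothetical helper, not a function from
src.sh.

```sh
# print the url(s) recorded for a version, archive type, and content type.
# usage: lookup_url <urls file> <version> <archive type> <arc|sig>
lookup_url() {
    # appending "" forces string comparison; otherwise awk would compare
    # numeric-looking versions as numbers and treat 1.0 and 1.00 as equal
    awk -F '\t' -v v="$2" -v t="$3" -v c="$4" \
        '($1 "") == (v "") && $2 == t && $3 == c { print $4 }' "$1"
}
```

e.g., `lookup_url "$SRCROOT/pkg/<some package>/urls" 1.01 tarball sig` prints
the signature url for the 1.01 tarball, or nothing if no such line exists.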

optionally, a package may also contain a `defaults.sh` file.  building on the
first example:

    <some package>/
     |-- checks
     |-- defaults.sh
     |-- urls


`defaults.sh` is a small shell script that defines default values for `version`
and `type`.  if the user does not specify the version and type of archive they
prefer, whatever values are defined in this file will be used.

for example:

    version="0.0.1"
    type="git"
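
one way the defaults could be combined with the command line arguments is
sketched below.  `resolve_request` is a hypothetical name; how src.sh itself
applies `defaults.sh` may differ.

```sh
# resolve the version and archive type for a request.
# usage: resolve_request <package dir> [<version> [<archive type>]]
# prints "<version> <type>" on one line.
resolve_request() (
    pkgdir=$1
    version='' type=''
    # defaults.sh is optional; it is sourced, so it must be trusted
    [ -f "$pkgdir/defaults.sh" ] && . "$pkgdir/defaults.sh"
    # values given by the user win over the defaults
    version=${2:-$version}
    type=${3:-$type}
    if [ -z "$version" ] || [ -z "$type" ]; then
        echo "no version or type given and no default defined" >&2
        exit 1
    fi
    printf '%s %s\n' "$version" "$type"
)
```

with the `defaults.sh` above, `resolve_request "$SRCROOT/pkg/<some package>"`
prints `0.0.1 git`, and adding `1.01 tarball` as arguments overrides both
values.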

the `checks` file is a list of tab-delimited lines in this format:

    <version>	<type>	<arc or sig>	<size>	<checksum>

- version: the package version the check is for.

- type: the archive type the check is for.

- arc or sig: literally `arc` or `sig`.  indicates whether this check is for an
  archive or a signature of an archive.

- size: the expected size of the archive or signature.  signature sizes are
  always in bytes.  archive sizes depend on the archive type script.  for
  tarballs, it's bytes.  for git, it's the working directory's "bytes on disk,"
  or `du -sk` times 1,024.  if the calculated size and the expected size don't
  match, the check fails.

- checksum: the expected checksum of the archive or signature.  signature
  checksums are always sha512 checksums.  archive checksums depend on the
  archive type.  for tarballs, it's just a sha512 checksum.  for git, it's the
  sha512 checksum of all the sha512 checksums of all the files in the working
  directory sorted, excluding the `.git` directory.  if the expected checksum
  does not match the calculated checksum, the check fails.
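
the size and checksum rules above can be sketched as follows.  all function
names are hypothetical, `sha512sum` from GNU coreutils is assumed, and
`tree_sha512` is only one reading of the git rule (hash each file outside
`.git`, sort the digests, hash the sorted list); the type scripts in `lib/type`
are the authority on the exact procedure.

```sh
# size of a file in bytes
file_size() { wc -c < "$1" | tr -d ' '; }

# sha512 digest of a file
file_sha512() { sha512sum < "$1" | cut -d' ' -f1; }

# compare a tarball or signature against an expected size and checksum.
# usage: check_file <file> <size> <checksum>
check_file() {
    [ "$(file_size "$1")" = "$2" ] || { echo "size mismatch: $1" >&2; return 1; }
    [ "$(file_sha512 "$1")" = "$3" ] || { echo "checksum mismatch: $1" >&2; return 1; }
}

# "bytes on disk" of a working directory: du -sk times 1,024
tree_size() { echo $(( $(du -sk "$1" | cut -f1) * 1024 )); }

# checksum of a working directory, skipping .git (assumed procedure).
# file names containing newlines are not handled.
tree_sha512() (
    cd "$1" || exit 1
    find . -path ./.git -prune -o -type f -print |
        while IFS= read -r f; do sha512sum < "$f"; done |
        LC_ALL=C sort | sha512sum | cut -d' ' -f1
)
```

`check_file` returns non-zero on the first mismatch, which mirrors the "check
fails" wording above.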

once a package has been pulled, a few more directories may appear, depending
on the behavior of the archive type definition scripts in the user's version of
src.sh. a package whose user has depended on tarballs and git at various times
may look like this:

    <some package>/
     |-- git/ # bare git repository; not going to expand this one!
     |
     |-- sig/
     |    |-- 0.01.tar.gz.sig
     |    |-- 0.02.tar.gz.sig
     |    |-- 1.00.tar.gz.sig
     |    |-- 1.01.tar.gz.sig # and so on
     |
     |-- tarball/
     |    |-- <some package>-0.01.tar.gz
     |    |-- <some package>-0.02.tar.gz
     |    |-- <some package>-1.00.tar.gz
     |    |-- <some package>-1.01.tar.gz # and so on
     |
     |-- checks
     |-- defaults.sh
     |-- urls


in this example:

- `git` contains a bare git repository.

- `sig` contains all the signatures downloaded from the `sig` urls listed in
   `urls` throughout the package's history, organized by `type` and `version`.

- `tarball` contains all the `tar.gz` files downloaded from the `arc` urls
   listed in `urls`, renamed as `<version>.tar.gz`.

also, the path set in SRCREPO will contain the extracted source code or working
trees (depending on archive type), one directory per version:

    $SRCREPO/ # like /src or $SRCROOT/code or something
    |-- <some package>/
    |    |-- 0.01/
    |    |-- 0.02/
    |    |-- 1.00/
    |    |-- 1.01/ # and so on

## defining archive types

the definitions in `lib/type` are good examples in themselves, but in short, an
archive definition file consists of a script defining the following functions:

- get_archive
- sigcheck_archive
- unpack_archive
- archive_name
- archive_size
- archive_checksum
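
when writing a new archive type, a quick sanity check is to confirm the script
defines all six functions.  the sketch below is a hypothetical helper, not part
of src.sh, and it assumes the script only defines functions when sourced.

```sh
# check that an archive type script defines every function listed above.
# usage: check_type_script <path to script, containing a slash>
check_type_script() (
    # the script is sourced (in a subshell), so it must be trusted
    . "$1" || exit 1
    missing=0
    for fn in get_archive sigcheck_archive unpack_archive \
              archive_name archive_size archive_checksum; do
        if ! command -v "$fn" >/dev/null 2>&1; then
            echo "$1: missing function $fn" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)
```

e.g., `check_type_script "$SRCROOT/lib/type/git.sh"` returns zero when nothing
is missing and names each absent function on stderr otherwise.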