The repository contains a collection of random scripts and short programs.
# Requirements
- Ruby 3 for Ruby scripts in `bin/`.
- For Haskell: the GHC compiler and the cabal build system. The programs can then be
built and run with `cabal run program-name -- --options`.
# Contents
1. [7digital.rb](#7digitalrb)
2. [mock\_server.rb](#mock_serverrb)
3. [read\_logs.rb](#read_logsrb)
4. [cross\_toolchain.rb](#cross_toolchainrb)
5. [rename.rb](#renamerb)
6. [tea-cleaner](#tea-cleaner)
## 7digital.rb
7digital sells digital music but they can't handle files with non-English names.
`bin/7digital.rb` takes 2 arguments, a zip archive with audio files and a target
directory. It extracts the archive into the directory and renames its contents
according to the meta information saved in the audio files. The audio files are
expected to be in 2 directories, the artist and album directories. These
directories are also renamed.
## mock\_server.rb
`bin/mock_server.rb` takes some JSON on its STDIN and starts a simple HTTP server
that slowly (in chunks) answers all requests with the given input.
For example:
```sh
echo '{"var": "stuff"}' | ./bin/mock_server.rb
```
and in another session:
```sh
curl localhost:8082
```
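
The "slow, chunked" behaviour can be sketched roughly as follows. This is not the code of `bin/mock_server.rb`, only a minimal illustration that assumes HTTP/1.1 chunked transfer encoding and the port from the `curl` call above. The names `chunked_frames`, `skip_request` and `serve`, the chunk size and the delay are all made up for the example.

```ruby
require 'socket'

# Split the payload into HTTP/1.1 chunked-encoding frames of at most
# chunk_size bytes, followed by the terminating zero-length chunk.
def chunked_frames(payload, chunk_size)
  frames = payload.bytes.each_slice(chunk_size).map do |bytes|
    "#{bytes.size.to_s(16)}\r\n#{bytes.pack('C*')}\r\n"
  end
  frames << "0\r\n\r\n"
end

# Read and discard the request line and headers; the request is ignored.
def skip_request(client)
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
end

# Answer every connection with the same payload, pausing between chunks.
# Port 8082 matches the curl example; chunk size and delay are assumptions.
def serve(payload, port: 8082, chunk_size: 8, delay: 0.5)
  server = TCPServer.new('127.0.0.1', port)
  loop do
    client = server.accept
    skip_request(client)
    client.write("HTTP/1.1 200 OK\r\n" \
                 "Content-Type: application/json\r\n" \
                 "Transfer-Encoding: chunked\r\n" \
                 "Connection: close\r\n\r\n")
    chunked_frames(payload, chunk_size).each do |frame|
      client.write(frame)
      sleep(delay)
    end
    client.close
  end
end
```

Calling `serve($stdin.read)` at the end of such a file would mimic the pipeline shown above.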
## read\_logs.rb
`bin/read_logs.rb` looks in the `log/` directory for files ending with `.log`,
`.log.1`, `.log.2.gz`, `.log.3.gz` and so forth. It selects lines that start
with a timestamp, `yyyy-mm-ddThh:mm:ss`, followed by arbitrary characters and a
custom string provided as the only command line parameter. Finally,
it outputs the content that follows the provided string, along with the date.
The log files are read in an order determined by the number in the filename.
For example, calling the script as
```sh
./bin/read_logs.rb 'doctrine.INFO:'
```
on a log file containing
`[2025-02-04T19:51:49.356093+01:00] doctrine.INFO: Disconnecting [] []`
will print:
```
2025-02-04 (Disconnecting [])
```
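
The file ordering and line matching described above could look roughly like the sketch below. It is not taken from `bin/read_logs.rb`: the helper names are hypothetical, ascending order by rotation number is an assumption, and the helpers return the date and the remaining text as a pair instead of reproducing the script's exact output format.

```ruby
require 'zlib'

# Rotation number from the file name: app.log -> 0, app.log.1 -> 1,
# app.log.2.gz -> 2. Returns nil for names that are not log files.
def rotation_index(name)
  match = name.match(/\.log(?:\.(\d+))?(?:\.gz)?\z/)
  match && match[1].to_i
end

# Keep only log files and sort them by rotation number (ascending here,
# which is an assumption about the real script).
def ordered_logs(names)
  names.select { |name| rotation_index(name) }
       .sort_by { |name| rotation_index(name) }
end

# Read all lines of a plain or gzip-compressed log file.
def read_lines(path)
  if path.end_with?('.gz')
    Zlib::GzipReader.open(path) { |gz| gz.readlines }
  else
    File.readlines(path)
  end
end

# Match an optional "[", a timestamp, arbitrary characters and the needle.
# Returns [date, text after the needle], or nil if the line does not match.
def extract(line, needle)
  pattern = /\A\[?(\d{4}-\d{2}-\d{2})T\d{2}:\d{2}:\d{2}.*?#{Regexp.escape(needle)}\s*(.*)\z/
  match = line.chomp.match(pattern)
  match && [match[1], match[2]]
end
```

With the log line from the example, `extract(line, 'doctrine.INFO:')` yields the date `2025-02-04` and the text following the needle.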
## cross\_toolchain.rb
`bin/cross_toolchain.rb` builds a cross toolchain for 32-bit RISC-V (G). The
script should work on Linux and on macOS with GNU tools installed and a
case-sensitive file system. The resulting GCC binaries can be found in
`./tmp/rootfs/bin/riscv32-unknown-linux-gnu-*`.
## rename.rb
Changes the extension of all files matching a pattern in the given directory.
Call `rename.rb` without arguments to see the usage information and an
example.
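
The core of such a rename can be sketched in a few lines. This is only an illustration, not the actual `rename.rb`: the function names, the argument order and the use of a glob pattern (rather than, say, a regular expression) are assumptions.

```ruby
# Return the path with its last extension replaced,
# e.g. "notes/todo.txt" -> "notes/todo.md".
def with_extension(path, new_ext)
  base = File.basename(path, '.*')
  File.join(File.dirname(path), "#{base}.#{new_ext.delete_prefix('.')}")
end

# Rename every regular file in dir whose name matches the glob pattern
# and return the list of new paths.
def change_extensions(dir, pattern, new_ext)
  files = Dir.glob(File.join(dir, pattern)).select { |path| File.file?(path) }
  files.map do |old_path|
    new_path = with_extension(old_path, new_ext)
    File.rename(old_path, new_path)
    new_path
  end
end
```

For instance, `change_extensions('docs', '*.txt', 'md')` would turn every `.txt` file directly inside `docs` into a `.md` file.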
## tea-cleaner
`tea-cleaner` tries to detect spam accounts on a Gitea instance and can remove
them automatically.
### Run instructions
See `tea-cleaner.toml.dist` for a description of the available configuration.
Copy this file to `config/tea-cleaner.toml` and change at least the `token` and
`server` values. After that, running plain `tea-cleaner` prints a list
of user accounts that look suspicious. Rerunning the command with the
`--live-run` flag will purge the listed accounts and all their activities,
assuming the given token has administrative access to the Gitea instance.
Run `tea-cleaner --help` to see all available command line options.
### Applied rules
Critical:

- The account is older than a month and the user hasn't logged in since then.
- User information contains banned words (can be adjusted in the configuration file).
- User's homepage contains percent-encoded symbols.

Possible:

- User filled in the fields for personal information: description and website.
- The mail address domain is unusual (can be adjusted in the configuration file).

Accounts that violate one of the critical rules are marked for removal
right away. The other checks trigger an additional lookup of the user's latest
activities. If all the user did was create an empty repository, the
account is marked for removal as well.

The rules are based on my investigation of spam accounts on this instance.
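
To make the decision flow concrete, here is a rough sketch of the rules in Ruby. `tea-cleaner` itself is a Haskell program and this is not its code. The hash keys, the `banned_words` and `unusual_domains` parameters and the activity symbols are hypothetical, "a month" is approximated as 30 days, and "hasn't logged in since then" is read as "no login after registration".

```ruby
MONTH = 30 * 24 * 60 * 60 # seconds; "a month" approximated as 30 days

# Critical rules: any single match marks the account for removal.
def critical?(user, banned_words:, now: Time.now)
  stale = user[:created] < now - MONTH && user[:last_login] <= user[:created]
  text = [user[:full_name], user[:description], user[:website]].compact.join(' ').downcase
  banned = banned_words.any? { |word| text.include?(word.downcase) }
  encoded = user[:website].to_s.match?(/%\h{2}/) # percent-encoded symbol
  stale || banned || encoded
end

# Possible rules: a match only triggers a closer look at the activities.
# Requiring both description and website is an assumption.
def possible?(user, unusual_domains:)
  filled = !user[:description].to_s.empty? && !user[:website].to_s.empty?
  domain = user[:email].to_s.split('@').last.to_s.downcase
  filled || unusual_domains.include?(domain)
end

# Combine both rule sets with the activity lookup.
def verdict(user, activities, banned_words:, unusual_domains:, now: Time.now)
  return :remove if critical?(user, banned_words: banned_words, now: now)
  return :keep unless possible?(user, unusual_domains: unusual_domains)

  # Treating "no activity at all" as not suspicious is an assumption.
  only_empty_repo = activities.any? && activities.all? { |a| a == :create_empty_repo }
  only_empty_repo ? :remove : :keep
end
```

The two-step structure mirrors the text: critical matches short-circuit, while possible matches defer to the activity check.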