FOCUS on Political Repression in Southern Africa was the news bulletin of the International Defence & Aid Fund (IDAF), published from 1975 onwards. Each issue documented detentions, political trials, banning orders, deaths in custody, and military violence across South Africa, Rhodesia/Zimbabwe, Namibia, Mozambique, and the wider region, drawing on contemporary press reporting.

This site is a research front-end for the structured archive that The Bitter Aloe Project builds from the original page scans. Page layouts are produced with dots.ocr, manually corrected page by page, and then post-processed with Gemini 3.1 Flash Lite Preview to recover the editorial structure of each issue and to surface the people, places, and organizations mentioned in every section.

Datasets

  • focus-raw-ocr — Page-level corrected dots.ocr layouts joined to the original page renders.
  • focus-processed-articles — Issue-level structured extractions: articles, summaries, and per-section people, places, and organizations.
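
As a rough illustration of what an issue-level record in focus-processed-articles might hold, the Python sketch below models one editorial section with its entity lists and collects the sections that mention a given person. The field names (kind, title, summary, people, places, organizations), the sections_mentioning helper, and the sample values are assumptions made for illustration, not the published schema.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    # Hypothetical field names; the published schema may differ.
    kind: str       # e.g. "front_matter", "article", "end_matter"
    title: str
    summary: str
    people: list[str] = field(default_factory=list)
    places: list[str] = field(default_factory=list)
    organizations: list[str] = field(default_factory=list)


def sections_mentioning(sections: list[Section], person: str) -> list[str]:
    """Return the titles of sections whose people list contains `person`."""
    return [s.title for s in sections if person in s.people]


# Placeholder data, not taken from the archive.
sample_issue = [
    Section("article", "Example article A", "Placeholder summary.",
            people=["Person A"], places=["Place X"]),
    Section("article", "Example article B", "Placeholder summary.",
            people=["Person A", "Person B"], organizations=["Org Y"]),
    Section("end_matter", "Example end matter", "Placeholder summary."),
]

matches = sections_mentioning(sample_issue, "Person A")
```

Keeping the entity lists on each section, rather than on the issue as a whole, mirrors the per-section extraction described above and lets a lookup point back to the specific article.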

How it works

  1. Each page is rendered to PNG and run through dots.ocr to produce a layout JSON.
  2. Annotators validate the layout and correct any OCR errors in a custom tool.
  3. Per-page corrections are joined with the rendered images and published as focus-raw-ocr.
  4. For each issue, the corrected layouts are sent to Gemini 3.1 Flash Lite Preview, which returns structured JSON listing each editorial section (front matter, articles, end matter), a reflowed body, a summary, and the per-section entity lists. The results are published as focus-processed-articles and rendered into this site.
  5. The site is built with Astro, deployed to GitHub Pages, and indexed at build time by Pagefind for fully-static client-side search.
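
Step 3 amounts to a join keyed on page. A minimal sketch in Python, assuming each corrected layout and each rendered PNG is identified by an (issue, page) pair; the identifiers, file names, record fields, and the join_pages helper are hypothetical and not taken from the project's code.

```python
def join_pages(
    layouts: dict[tuple[str, int], dict],
    images: dict[tuple[str, int], str],
) -> tuple[list[dict], list[tuple[str, int]]]:
    """Pair each corrected layout with its page render.

    Returns the joined records plus the keys that had a layout but no image,
    so gaps can be reviewed instead of being silently dropped.
    """
    records, missing = [], []
    for key in sorted(layouts):  # sort by (issue, page) for a stable order
        if key not in images:
            missing.append(key)
            continue
        issue, page = key
        records.append({
            "issue": issue,
            "page": page,
            "image": images[key],
            "layout": layouts[key],
        })
    return records, missing


# Placeholder inputs, not taken from the archive.
layouts = {
    ("issue-01", 2): {"blocks": []},
    ("issue-01", 1): {"blocks": []},
    ("issue-02", 1): {"blocks": []},
}
images = {
    ("issue-01", 1): "issue-01_p001.png",
    ("issue-01", 2): "issue-01_p002.png",
}

records, missing = join_pages(layouts, images)
```

Returning the unmatched keys separately is a design choice for this sketch: a page with a corrected layout but no render is reported for review rather than discarded.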

Citation

@misc{bitter_aloe_focus_archive,
  title  = {The FOCUS Archive},
  author = {The Bitter Aloe Project},
  year   = {2026},
  url    = {https://wjbmattingly.github.io/focus-dataset}
}

License

Code on this site is MIT-licensed. The structured datasets are released for non-commercial research use under CC-BY-NC-4.0. The underlying FOCUS publication is © the International Defence & Aid Fund and its successors.