aggregation-framework

A Swiss-army-knife library for scraping and processing data from the web. It provides a unified interface over multiple HTTP clients, plus convenience functions for parsing and preprocessing data before your applications use it.

Quickly build HTTP requests for a variety of data formats and APIs.
Parse common data formats such as XML, HTML, and JSON.
Push your aggregated data automatically to your preferred data store or message broker (such as Kafka, MySQL, or Postgres).
Write your own collectors for non-standard data formats.

graph LR
    EXT1[(External HTTP API)]
    EXT2[(External HTTP API)]
    EXT3[(External HTTP API)]

    COL1[/Collector/]
    COL2[/Collector/]
    COL3[/Collector/]
    
    DB[(Application Database)]
    BE1[Backend Application]
    BE2[Backend Application]
    BE3[Backend Application]
  
    subgraph AP[Aggregation Framework]
        COL1
        COL2
        COL3
    end
    
    EXT1 --> COL1
    EXT2 --> COL2
    EXT3 --> COL3
    
    COL1 & COL2 & COL3 --> DB --> BE1 & BE2 & BE3

Get Started

Add Aggregation Framework and your preferred extensions to your project. For sbt:

// add Forge as a resolver
resolvers += "Gitea Package API" at "https://forge.cptlobster.dev/api/packages/cptlobster/maven"

libraryDependencies += "dev.cptlobster" %% "aggregation-framework-core" % "0.1.0-SNAPSHOT"
// for JSON parsing
libraryDependencies += "dev.cptlobster" %% "aggregation-framework-json" % "0.1.0-SNAPSHOT"

Note: Snapshot versions are available at forge.cptlobster.dev. Release versions will be published to Maven Central at a future date.

To create a consumer, follow the tutorial.

Target Artifacts

The project is split into a core package and a set of extension packages, so that you only pull in the external dependencies you actually use.

The core package is located under /core in this repository, and the extension packages are located under their own subdirectories in /ext. Each extension package has its own README that describes it in more detail.

graph BT
    CORE[aggregation-framework-core]
    JSON[aggregation-framework-json]
    KAFKA[aggregation-framework-kafka]
    SEL[aggregation-framework-selenium]
    RUNNER[aggregation-framework-runner]
    CORE --> JSON & KAFKA & SEL & RUNNER
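
As an illustration of pulling in only the pieces you need, a build.sbt fragment for the core package plus the Kafka extension might look like the sketch below. The artifact names are taken from the diagram above, and the group ID, resolver, and version from the Get Started snippet. That every extension is published under the same version as the core snapshot is an assumption, so check the package registry before relying on it.

```scala
// build.sbt (sketch): core plus the Kafka extension only
resolvers += "Gitea Package API" at "https://forge.cptlobster.dev/api/packages/cptlobster/maven"

// Assumption: all modules share the core snapshot's version number.
val aggregationVersion = "0.1.0-SNAPSHOT"

libraryDependencies ++= Seq(
  "dev.cptlobster" %% "aggregation-framework-core"  % aggregationVersion,
  "dev.cptlobster" %% "aggregation-framework-kafka" % aggregationVersion
)
```

Keeping the version in a single value makes it easier to move all modules together once release versions are available.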

Development

This project uses sbt for project and dependency management. Install sbt via your preferred package manager; if you use IntelliJ, it can manage sbt for you.

To build the entire project:

sbt compile

License

This program is licensed under the GNU Lesser General Public License, version 3.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU Lesser General Public License (and the GNU General Public License) along with this program. If not, see https://www.gnu.org/licenses/.
