Data Engineering

  • ETL
  • data modelling
  • data cleaning

Related Projects

  • Python Developer, Frontend and Backend
    2023 University of Cologne (CECAD Imaging Facility)

    Development of an exporter for OMERO, a database for biomedical image data

    Tasks:
    • Extension of the open-source command-line tool omero-cli-transfer to transfer research data from the OMERO database to ARC repositories.
    • Development of a mapping specification for transferring OMERO projects to ARC repositories.
    • Extension of the OMERO web frontend to display ARC metadata and control data export.
    • Documentation

    Tools:
    Python, OMERO, Postgres, Django, Git, GitHub Actions, Docker, Pytest

  • Python Developer / Data Engineer
    2023 DekaBank

    Frontend and backend development of a web application for portfolio management.

    Tasks:
    • Frontend and backend development with Python and JavaScript
    • Design of the class-based software architecture in the backend
    • Data model design (combination of relational and JSON-based models); implementation of test, staging and production databases
    • Documentation
    • Setup of a continuous integration pipeline (package installation, unit testing, PEP 8 checks, automated documentation builds)
    • Work within an interdisciplinary team of financial experts, software developers and analysts

    Tools:
    Python, Plotly Dash, Pydantic, Mypy, Sphinx, GitLab, Pandas, MSSQL, Pytest, JavaScript

  • Python Developer / Data Engineer
    2022 DekaBank

    Frontend and backend development of a business intelligence web application.

    Tasks:
    • Frontend and backend development
    • Development of data buffering solutions to serve data quickly from fragmented data sources
    • Handling and efficient provisioning of large data tables
    • Implementation of a business intelligence web app
    • Refactoring of prototype scripts into production code (unit tests, continuous integration…)
    • Work within an interdisciplinary team of financial experts, software developers and analysts

    Tools:
    Python, Pydantic, Mypy, Plotly Dash, Flask, GitLab, Pandas, MSSQL, Pytest, MLflow, Parquet, JavaScript

  • Data Scientist and Data Engineer
    2021 Medium-sized Trading and Logistics Company

    Development of AI-based sales prediction models.

    Design and implementation of an AI-supported prediction model for print media sales in magazine distribution, including prototyping, implementation and deployment of the production system.

    Tasks:
    • Design of a deep learning model for the prediction of sales figures
    • Development of a data model and ETL processes for processing raw data
    • Implementation of the automated prediction service based on an AI model

    Tools:
    TensorFlow, Postgres, MS SQL Server, Python, SQLAlchemy, Alembic, Docker, docker-compose, Git

  • Data Engineer / Project Lead
    2020 Research Institute

    Design and implementation of a Postgres database for data management of an automation system.

    An automation plant produces sensor data of various types, which are fed into mathematical prediction models together with various metadata. Measurement data and metadata are to be stored centrally in an SQL database. The project requires close interaction with employees who operate the automation system and evaluate the data. The interdisciplinary team includes biologists, chemists, technicians, data analysts and software developers.

    Tasks:
    • Project management
    • Development of the data model in numerous workshops.
    • Development of import specifications in close cooperation with future users.
    • Implementation of the model in Python/SQLAlchemy
    • Setting up a Postgres test database with docker-compose and GitLab CI
    • Implementation of importer tools in Python.

    Tools:
    Postgres, Python, SQLAlchemy, Docker, docker-compose, GitLab CI