aws etl

Bank, building a data warehouse in Redshift with visualizations in Power BI

  • store raw data on S3
  • use Terraform to define AWS resources
  • use Python and PySpark to code AWS Glue jobs that create and update Redshift tables
  • develop visualizations with Power BI Desktop
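
To illustrate the "create and update Redshift tables" step, here is a minimal sketch of the common staging-table upsert pattern (delete matching rows, then insert the fresh ones). It only builds the SQL statements; it is not the project's actual Glue code. The function name and the table and column names are hypothetical, and the sketch assumes the staging table has the same columns as the target.

```python
from __future__ import annotations


def build_upsert_sql(target: str, staging: str, key_columns: list[str]) -> list[str]:
    """Build the statements for a delete-then-insert upsert from a staging table.

    Assumption: `staging` has the same column layout as `target`.
    All names passed in are hypothetical examples, not real project tables.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    # Match rows in the target that also exist in the staging table.
    match = " AND ".join(f"{target}.{col} = {staging}.{col}" for col in key_columns)
    return [
        "BEGIN;",
        # Remove the old versions of the rows that are about to be reloaded.
        f"DELETE FROM {target} USING {staging} WHERE {match};",
        # Load every staged row, both updated and brand-new ones.
        f"INSERT INTO {target} SELECT * FROM {staging};",
        "END;",
    ]


statements = build_upsert_sql("dw.accounts", "dw.accounts_stage", ["account_id"])
```

In a Glue job, statements like these would typically be run against Redshift after the PySpark step has written the staging table; how they are submitted depends on the job setup.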

Insurance, main data source: various diverse SQL Server databases

  • integrate data sources in SQL Server
  • ETL and integrate SQL Server data with non-SQL data using Qlik Sense scripting
  • use Qlik Sense Set Analysis when necessary
  • develop visualizations and publish them for self-service
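
The idea of combining SQL data with non-SQL data can be sketched in a few lines. The project did this in Qlik Sense script; the Python below is only a stand-in that shows the same join logic, with an in-memory SQLite database in place of SQL Server and a CSV string in place of the non-SQL source. All table, column, and function names are hypothetical.

```python
import csv
import io
import sqlite3


def load_policies(conn):
    """Read the SQL side into a dict keyed by policy_id."""
    rows = conn.execute("SELECT policy_id, holder FROM policies").fetchall()
    return {pid: {"policy_id": pid, "holder": holder} for pid, holder in rows}


def merge_claims(policies, csv_text):
    """Attach claim amounts from CSV text to the matching SQL rows."""
    merged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        policy = policies.get(int(row["policy_id"]))
        if policy is None:
            continue  # claim refers to a policy the SQL source does not know
        merged.append({**policy, "claim_amount": float(row["claim_amount"])})
    return merged


def demo():
    # SQLite stands in for SQL Server here (an assumption for the sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, holder TEXT)")
    conn.executemany("INSERT INTO policies VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    # The CSV text stands in for a non-SQL file source.
    csv_text = "policy_id,claim_amount\n1,250.0\n3,99.5\n"
    return merge_claims(load_policies(conn), csv_text)
```

Calling `demo()` keeps only the claim whose `policy_id` exists on the SQL side, which mirrors an inner join between the two sources.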
qlik sense etl
ETL

Moving to a new Billing platform

  • find out which tables matter
  • find out how tables are connected
  • code ETL in native Progress for migration
  • code ETL for SQL data warehouse
  • code ETL for data not connected to system
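
Finding out how tables are connected can be partly automated when the database declares its foreign keys. The sketch below uses SQLite as a stand-in (the migration itself was coded in native Progress) and lists each declared link. The table names and the `table_links` helper are hypothetical; in a legacy system without declared keys, the links often have to be inferred from column names or application code instead.

```python
import sqlite3


def table_links(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    links = []
    for table in tables:
        # Each row: (id, seq, parent_table, from_column, to_column, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            links.append((table, fk[3], fk[2], fk[4]))
    return links


def demo():
    # Hypothetical billing schema, used only to exercise the helper.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoice (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id)
        );
        CREATE TABLE invoice_line (
            id INTEGER PRIMARY KEY,
            invoice_id INTEGER REFERENCES invoice(id)
        );
        """
    )
    return table_links(conn)
```

The resulting list of links gives a first draft of the load order: parent tables such as `customer` must be migrated before the tables that reference them.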

IBM purchased a company. The SAP download that previously fed an application had to be replaced with data from various IBM sources

  • map data streams from various IBM sources to the SAP data
  • identify the most important elements
  • eliminate data and calculations that were no longer needed or important
  • simplify the system as much as possible
  • load and clean incoming data
  • write front-end routines that convert the new data to the right format
  • change the existing application as little as possible
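
A front-end conversion routine of this kind can be sketched as a small field map: incoming records are cleaned, renamed to the layout the existing application already expects, and anything not in the map is dropped. The field names and the `to_legacy_record` function are hypothetical and are not the real IBM or SAP field names.

```python
# Hypothetical mapping: new source field -> field name the application expects.
FIELD_MAP = {
    "customerNumber": "CUST_NO",
    "orderTotal": "ORDER_AMT",
    "orderDate": "ORDER_DATE",
}


def clean(value):
    """Trim surrounding whitespace from strings; leave other values unchanged."""
    return value.strip() if isinstance(value, str) else value


def to_legacy_record(new_record):
    """Convert one incoming record to the legacy layout.

    Fields that are not in FIELD_MAP are dropped, which is how data that is
    no longer needed gets eliminated before it reaches the application.
    """
    missing = [field for field in FIELD_MAP if field not in new_record]
    if missing:
        raise KeyError(f"incoming record is missing fields: {missing}")
    return {legacy: clean(new_record[src]) for src, legacy in FIELD_MAP.items()}


example = to_legacy_record(
    {
        "customerNumber": " 42 ",
        "orderTotal": 10.5,
        "orderDate": "2020-01-31",
        "internalFlag": True,  # not mapped, so it is dropped
    }
)
```

Keeping the conversion in one adapter like this means the existing application keeps reading the same layout and needs little or no change.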
etl ibm