The Data Engineer Senior is responsible for performing advanced data engineering work using Python and related tools. The incumbent will support creating reliable pipelines, combining data sources, architecting data stores, assisting in architecting distributed systems, and supporting data management platforms. This position will proactively work with the Data Engineering Team, Data Architect Team, IT staff, and agency employees.
This is a hybrid position and requires two days of onsite work per week at our TRS office in Austin, TX.
WHAT WILL YOU DO:
Data Engineering
• Develop and maintain data pipelines and ETL processes using Python.
• Design, develop, and implement Python applications to support business requirements.
• Design and implement scalable data processing solutions to handle large volumes of data.
• Lead collaboration with analysts and other stakeholders to understand data requirements and develop solutions that support their needs.
• Incorporate external data sources and APIs to enrich and expand the organization's data assets.
• Utilize extensive knowledge of Python, SQL, and REST APIs to improve current applications and build new ones.
• Drive bug fixing and improve application performance.
• Develop solutions using data science techniques and tools.
• Ensure data security, privacy, and compliance with regulations by applying data engineering best practices.
• Develop and maintain data documentation, including data dictionaries, data lineage, and metadata management.
• Continuously discover, evaluate, and implement new technologies to maximize development efficiency.
• Leverage version control systems such as Git to manage the codebase.
• Adhere to established practices for data pipeline continuous integration and continuous delivery (CI/CD).
• Collaborate in identifying and integrating potential new data sources.
• Perform related work as assigned.
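As an illustration of the kind of pipeline work described above (not part of the posting itself), a minimal transform-and-load step in Python might look like the sketch below; the record fields, table name, and values are hypothetical:

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a REST API response.
raw_records = [
    {"member_id": "1001", "contribution": "250.00", "month": "2024-01"},
    {"member_id": "1002", "contribution": "300.50", "month": "2024-01"},
]

def transform(record):
    # Normalize string fields to numeric types so downstream queries can aggregate.
    return (int(record["member_id"]), float(record["contribution"]), record["month"])

def load(rows, conn):
    # Create the target table if needed, then bulk-insert with parameter binding.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contributions "
        "(member_id INTEGER, contribution REAL, month TEXT)"
    )
    conn.executemany("INSERT INTO contributions VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load([transform(r) for r in raw_records], conn)
total = conn.execute("SELECT SUM(contribution) FROM contributions").fetchone()[0]
print(total)  # 550.5
```

A production pipeline would add error handling, logging, and incremental-load logic, but the extract-transform-load shape is the same.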
WHAT WILL YOU BRING:
Required Education
• Bachelor’s degree from an accredited four-year college or university in computer science, software engineering, information technology, data analytics, or a related field.
• A high school diploma or equivalent, plus additional full-time experience in data analytics, Python programming, building data pipelines, and using analytics and ETL tools, may substitute for the required education on an equivalent year-for-year basis.
Required Experience
• Five (5) years of full-time directly related, progressively responsible experience in data analytics, building data pipelines, using analytics and ETL tools, or related experience.
• Two (2) years of full-time directly related, progressively responsible experience with Python programming, including hands-on management of Python libraries and environments in support of Python projects.
• Experience may be concurrent.
• A master's degree or doctoral degree in a closely related field may be substituted on an equivalent year-for-year basis.
Preferred Qualifications
• Expert-level development skills in Python.
• Experience working with architects to design data warehouses, data lakes, and data pipelines, and to operate warehouses and pipelines (Azure Data Factory, Synapse, and Databricks).
• A proven track record of architecting complex systems to work efficiently and reliably in mission-critical applications.
• Experience with quantitative research processes, methodologies, and tools is a plus.
• Experience with Financial Data.
• Experience with constraint optimization.
• Experience building complex data pipelines / ETL.
• Experience with GraphQL.
Knowledge, Skills, and Abilities
Knowledge of:
• Data management disciplines: data architecture, data warehousing, data integration and interoperability, data modeling (including dimensional), and data storage and operations.
• Process management and metrics management.
• Commonly used data and analytics technologies (e.g., Azure Synapse Analytics, SQL Server).
• Relational and non-relational data structures, theories, principles, and practices.
• Metadata management and associated processes.
• Web services (REST, SOAP, XML, WSDL, JSON).
• Python programming, application frameworks (Django, Flask, Pyramid, Tornado), testing, and code analysis tools (Pytest, Pylint).
• ETL tools (SSIS, Talend, Informatica).
• Agile development and DevOps approaches to maintaining pipelines and databases.
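For context on the testing tools named above, a minimal Pytest-style check of a hypothetical transformation function might look like this (the function and values are illustrative, not from the posting):

```python
def normalize_contribution(raw: str) -> float:
    """Parse a currency string such as '$1,250.00' into a float."""
    return float(raw.replace("$", "").replace(",", ""))

def test_normalize_contribution():
    # Pytest discovers functions prefixed with 'test_' and runs plain asserts.
    assert normalize_contribution("$1,250.00") == 1250.0
    assert normalize_contribution("300.5") == 300.5
```

Running `pytest` in the project directory would collect and execute the test automatically.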
Skills in:
• Using technical and statistical analysis to deliver clear, concise, and visually appealing management metrics and reports to inform decision making and actions.
• Writing report queries, including an understanding of relational databases.
• Completing detailed work with a high degree of accuracy.
• Planning, organizing, and prioritizing work assignments to manage a high-volume workload in a fast-paced and changing environment.
• Using a computer in a Windows environment with Microsoft Office word processing, spreadsheet, and other business software.
• Effective written and verbal communications, including explaining complex information to others in an understandable manner, and writing clear and precise policies, procedures, and training or other materials.
Ability to:
• Operate effectively in a fast-paced environment with competing and shifting priorities.
• Continually learn new concepts and tools to support job needs, and strive for ongoing professional development.
• Establish and maintain harmonious working relationships with co-workers, agency staff, and external contacts.
• Work effectively in a professional team environment.