Sahith - Sr. Data Engineer
PROFESSIONAL SUMMARY
10+ years of hands-on expertise as a Senior Data Engineer specializing in Database Development, ETL Development, Data Modeling, Report Development, and Big Data Technologies.
Proficient in programming languages like Python (Pandas, NumPy, PySpark, scikit-learn, PyTorch), SQL (including PL/SQL for Oracle), Scala, and PowerShell.
Extensive experience with cloud platforms including AWS (S3, Redshift, RDS, DynamoDB, EMR, Glue, Data Pipeline, Kinesis, Athena, QuickSight, Lambda, CloudFormation, CodePipeline), Azure (ADF, SQL Server, Cosmos DB, Databricks, HDInsight, Blob Storage, Data Lake Storage), and Google Cloud Platform (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud SQL, Cloud Datastore, Apache Beam).
Proficient in Apache Spark, Apache Airflow, Hadoop, Hive, Sqoop, Kafka, Impala, and Apache Beam for large-scale data processing and analytics.
Experience with cloud data warehouses like Redshift, Snowflake, and BigQuery for scalable data storage and retrieval.
Skilled in data visualization tools such as Tableau, Power BI, Google Data Studio, and QuickSight for creating insightful reports and dashboards.
Hands-on experience with data integration tools like Informatica, Talend, and SSIS for seamless data flow across systems.
In-depth knowledge of various database systems including SQL Server, Cosmos DB, Oracle, PostgreSQL, Cassandra, MySQL, and DynamoDB for efficient data storage and retrieval.
Proficient in handling data formats like JSON, XML, and Avro for data interchange and storage.
Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes for scalable and manageable deployments.
Experience with CI/CD pipelines using Azure DevOps, Jenkins, and AWS CodePipeline for automated software delivery and deployment.
Proficient in advanced Excel functions, pivot tables, and VLOOKUPs for data analysis and reporting.
Hands-on experience with AWS Glue and AWS Data Pipeline for ETL workflows and data processing.
Familiarity with version control systems like Git, GitHub, and Bitbucket for collaborative development and code management.
Strong understanding of security and access control principles including AWS IAM, AWS KMS, Azure Key Vault, Azure AD, SSL/TLS, and AES encryption standards.
Proficient in project management and collaboration tools like Bugzilla, Confluence, SharePoint, and JIRA, and in Agile, Scrum, and Kanban methodologies for efficient project execution.
Analyzed business needs, designed efficient processes, and managed development teams to deliver successful projects.
Implemented data pipelines, ensured system integration, and adapted to new technologies for continuous improvement.
Collaborated with stakeholders at all levels to align project goals and ensure informed decision-making.
Tackled complex data challenges with strong analytical skills and a drive to contribute to a dynamic and innovative environment.

TECHNICAL SKILLS
Cloud Platforms: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP)
Big Data Processing and Analytics: Apache Spark, Apache Airflow, Hadoop, Hive, Sqoop, Kafka, Impala, Apache Beam
Programming and Scripting: Python (Pandas, NumPy, PySpark, scikit-learn, PyTorch), SQL (including PL/SQL for Oracle), Scala, PowerShell
Data Integration and ETL Tools: AWS Glue, AWS Data Pipeline, Informatica, Talend, SSIS
Containerization and Orchestration: Docker, Kubernetes
Version Control and Collaboration: Git, GitHub, Bitbucket
CI/CD: Azure DevOps, Jenkins, AWS CodePipeline
Data Warehousing and Database Management: Redshift, Snowflake, BigQuery, SQL Server, Cosmos DB, Oracle, PostgreSQL, Cassandra, MySQL, DynamoDB
Data Visualization and BI Tools: Tableau, Power BI, Google Data Studio, QuickSight
Security and Access Control: AWS IAM, AWS KMS, Azure Key Vault, Azure AD, SSL/TLS, AES encryption standards
Miscellaneous Tools and Technologies: JSON, advanced Excel functions, pivot tables, VLOOKUPs, Bugzilla, Confluence, SharePoint, JIRA, Agile, Scrum, Kanban
Operating Systems: Windows, Linux, UNIX, macOS

EDUCATION
Masters
Bachelors

WORK EXPERIENCE
First America, CA
Data Engineer | Nov 2020 - Sep 2022
Designed and implemented data integration workflows using Azure Data Factory (ADF), ensuring seamless data
movement and transformation across on-premises and cloud environments.
Managed and optimized SQL Server databases, ensuring data integrity, performance, and availability for business
operations.
Implemented Azure Cosmos DB for globally distributed and scalable NoSQL database solutions, ensuring high availability
and low-latency data access.
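A minimal sketch of how such a Cosmos DB container might be created and queried with the azure-cosmos Python SDK (the endpoint, key, database, container, and document fields are hypothetical):

    from azure.cosmos import CosmosClient, PartitionKey

    # Hypothetical account endpoint and key; in practice these come from Key Vault.
    client = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<account-key>")
    db = client.create_database_if_not_exists("analytics")
    container = db.create_container_if_not_exists(
        id="events", partition_key=PartitionKey(path="/region")
    )

    # Upsert a document and read it back with a cross-partition SQL query.
    container.upsert_item({"id": "evt-1", "region": "us-west", "value": 42})
    results = container.query_items(
        "SELECT * FROM c WHERE c.region = 'us-west'",
        enable_cross_partition_query=True,
    )
    for item in results:
        print(item["id"], item["value"])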
Utilized Snowflake for cloud-based data warehousing, enabling scalable and flexible analytics solutions.
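A minimal sketch of the kind of Snowflake access used for such analytics work, via snowflake-connector-python (account, credentials, and table names are hypothetical):

    import snowflake.connector

    # Hypothetical connection parameters; credentials would normally come from a secrets store.
    conn = snowflake.connector.connect(
        account="xy12345",
        user="ETL_USER",
        password="<password>",
        warehouse="ANALYTICS_WH",
        database="ANALYTICS",
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT order_date, SUM(amount) AS total_amount "
            "FROM orders GROUP BY order_date ORDER BY order_date"
        )
        for order_date, total_amount in cur.fetchall():
            print(order_date, total_amount)
    finally:
        conn.close()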
Implemented data processing and analytics workflows using Azure Databricks, leveraging Apache Spark for distributed
computing and machine learning.
Managed and optimized big data clusters using Azure HDInsight, ensuring efficient data processing and analytics
capabilities.
Utilized Azure Blob Storage and Azure Data Lake Storage for scalable and cost-effective data storage solutions.
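A minimal sketch of the kind of Blob Storage interaction involved, using the azure-storage-blob SDK (the connection string, container, and blob paths are hypothetical):

    from azure.storage.blob import BlobServiceClient

    # Hypothetical connection string; in practice pulled from configuration or Key Vault.
    service = BlobServiceClient.from_connection_string("<connection-string>")
    container = service.get_container_client("raw-data")

    # Upload a daily extract, then read it back.
    with open("daily_extract.csv", "rb") as f:
        container.upload_blob(name="2022/01/15/daily_extract.csv", data=f, overwrite=True)

    downloaded = container.download_blob("2022/01/15/daily_extract.csv").readall()
    print(len(downloaded), "bytes downloaded")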
Automated tasks and workflows using PowerShell, streamlining data management and operations.
Applied Python with Pandas, NumPy, and PyTorch for data manipulation, analysis, and machine learning model
development, enhancing data processing capabilities.
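As an illustration, a small Pandas/NumPy sketch of this sort of data manipulation (the file and column names are hypothetical):

    import numpy as np
    import pandas as pd

    # Hypothetical transactions file with an amount column and a timestamp.
    df = pd.read_csv("transactions.csv", parse_dates=["created_at"])

    # Standardize amounts and summarize by month.
    df["amount_zscore"] = (df["amount"] - df["amount"].mean()) / df["amount"].std()
    df["log_amount"] = np.log1p(df["amount"])
    monthly = df.groupby(df["created_at"].dt.to_period("M"))["amount"].agg(["count", "sum", "mean"])
    print(monthly.head())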
Implemented data processing pipelines using Spark, handling large-scale data processing and analytics tasks.
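A minimal PySpark sketch of such a batch pipeline (the input and output paths and the schema are hypothetical):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily_orders").getOrCreate()

    # Read raw JSON events, keep completed orders, and derive a date column.
    orders = (
        spark.read.json("abfss://raw@account.dfs.core.windows.net/orders/")
        .filter(F.col("status") == "completed")
        .withColumn("order_date", F.to_date("created_at"))
    )

    # Aggregate to daily metrics and write curated Parquet, partitioned by date.
    daily = orders.groupBy("order_date").agg(
        F.count("*").alias("order_count"),
        F.sum("amount").alias("total_amount"),
    )
    daily.write.mode("overwrite").partitionBy("order_date").parquet(
        "abfss://curated@account.dfs.core.windows.net/daily_orders/"
    )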
Developed serverless functions using Azure Functions, enabling event-driven data processing and automation.
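A minimal sketch of a queue-triggered Azure Function in Python (the queue binding is configured in function.json; the event fields are hypothetical):

    import json
    import logging

    import azure.functions as func

    def main(msg: func.QueueMessage) -> None:
        # Parse the incoming queue message as JSON and log a couple of fields.
        event = json.loads(msg.get_body().decode("utf-8"))
        logging.info(
            "Processing event %s for customer %s",
            event.get("id"),
            event.get("customer_id"),
        )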
Managed and optimized Hadoop clusters for distributed data processing and analytics, ensuring scalability and
performance.
Implemented Kafka for real-time data streaming and processing, enabling real-time analytics and event-driven
architectures.
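A minimal sketch of Kafka producing and consuming with the kafka-python client (the broker address, topic, and message fields are hypothetical):

    import json

    from kafka import KafkaConsumer, KafkaProducer

    # Publish click events as JSON to a "clickstream" topic.
    producer = KafkaProducer(
        bootstrap_servers="broker:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    producer.send("clickstream", {"user_id": 123, "page": "/home"})
    producer.flush()

    # Consume and process events as they arrive.
    consumer = KafkaConsumer(
        "clickstream",
        bootstrap_servers="broker:9092",
        group_id="analytics",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    for message in consumer:
        print(message.value)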
Managed secrets and access control using Azure Key Vault and Azure Active Directory (Azure AD), ensuring data security
and compliance.
Processed and analyzed JSON data formats, enabling structured data processing and integration.
Managed code repositories and collaborated with teams using Bitbucket, ensuring version control and code quality.
Utilized Impala for interactive SQL queries and analytics on Hadoop-based data platforms.
Implemented continuous integration and continuous deployment (CI/CD) pipelines using Azure DevOps, ensuring
automated and reliable software delivery.
Monitored and managed Azure resources using Azure Monitor and Azure Log Analytics, ensuring performance
optimization and troubleshooting.
Automated infrastructure deployment and management using Terraform, ensuring consistent and scalable infrastructure
configurations.
Containerized applications and services using Docker, enabling scalable and portable deployment of data solutions.
Orchestrated containerized applications using Kubernetes, ensuring efficient management and scaling of containerized
workloads.
Developed and deployed interactive data visualizations using Power BI, enabling data-driven insights and decision-
making.
Worked within Agile methodologies, participating in Scrum ceremonies and sprint planning to deliver data solutions
iteratively and efficiently.
Managed project workflows and tasks using JIRA, ensuring collaboration and alignment with project goals and timelines.
Tech Stack: ADF, SQL Server, Azure Cosmos DB, Snowflake, Azure Databricks, Azure HDInsight, Azure Blob Storage, Azure
Data Lake Storage, PowerShell, Python, Spark, Azure Functions, Hadoop, Kafka, JSON, Bitbucket, Impala, Azure DevOps,
Azure Monitor, Terraform, Docker, Kubernetes, Power BI, JIRA.