Abubakkar
Technical Skills:
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Impala, Oozie, Kafka, Spark, Zookeeper, Storm, YARN, AWS, AWS S3, AWS EMR, PySpark, NiFi, AWS Glue.
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans
IDEs: Eclipse, NetBeans, IntelliJ
Frameworks: MVC, Struts, Hibernate, Spring
Programming Languages: Java, JavaScript, Scala, Python, Unix & Linux shell scripting, SQL
Databases: Oracle […] MySQL, DB2, Teradata, MS-SQL Server.
NoSQL Databases: HBase, Cassandra, MongoDB
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
ETL Tools: Informatica BDM, Talend.
Web Development: HTML, DHTML, XHTML, CSS, JavaScript, AJAX
XML/Web Services: XML, XSD, WSDL, SOAP, Apache Axis, DOM, SAX, JAXP, JAXB,
XMLBeans.
Methodologies/Design Patterns: OOAD, OOP, UML, MVC2, DAO, Factory Pattern, Session Facade
Operating Systems: Windows, AIX, Sun Solaris, HP-UX.
Big Data/Hadoop Developer
PRA Health Sciences - Conshohocken, PA                                Jan 2014 to December 2015
Responsibilities:
Imported data from different relational data sources such as RDBMS and Teradata into HDFS using Sqoop.
Bulk-imported data into HBase using MapReduce programs and performed analytics on time-series data stored in HBase using the HBase API, as sketched below.
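A minimal sketch of such a time-range read with the HBase client API; the table name, column family, qualifier, and timestamps are illustrative assumptions, not the actual schema:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TimeSeriesScan {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("metrics"))) {   // hypothetical table
                Scan scan = new Scan();
                scan.setTimeRange(1420070400000L, 1422748800000L);              // restrict to a time window
                scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("value"));     // hypothetical family/qualifier
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result r : scanner) {
                        System.out.println(Bytes.toString(r.getRow()) + " -> "
                            + Bytes.toString(r.getValue(Bytes.toBytes("d"), Bytes.toBytes("value"))));
                    }
                }
            }
        }
    }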
Designed and implemented incremental imports into Hive tables and used the REST API to access HBase data for analytics (see the sketch below).
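As a rough illustration of reading HBase over REST from Java, the sketch below issues a plain HTTP GET against an HBase REST gateway; the host, port, table, and row key are placeholders:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class HBaseRestRead {
        public static void main(String[] args) throws Exception {
            // Placeholder gateway host, port, table, and row key
            URL url = new URL("http://rest-gateway:8080/metrics/row-20150101-host01");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/json"); // cell values come back base64-encoded
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
            conn.disconnect();
        }
    }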
Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop ecosystem tools such as Hive, Pig, HBase, Flume, Oozie, Zookeeper, and Sqoop.
Created a POC to store server log data in MongoDB to identify system alert metrics, along the lines of the sketch below.
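A minimal sketch of that kind of POC with the MongoDB Java driver (3.x API); the database, collection, and field names are assumptions for illustration:

    import com.mongodb.MongoClient;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoDatabase;
    import org.bson.Document;

    public class ServerLogPoc {
        public static void main(String[] args) {
            MongoClient client = new MongoClient("localhost", 27017);
            try {
                MongoDatabase db = client.getDatabase("logs");                    // hypothetical database
                MongoCollection<Document> logs = db.getCollection("serverLogs");  // hypothetical collection
                logs.insertOne(new Document("host", "app01")
                        .append("level", "ERROR")
                        .append("message", "disk usage above threshold")
                        .append("ts", System.currentTimeMillis()));
                long alerts = logs.count(new Document("level", "ERROR"));         // candidate alert events
                System.out.println("Alert candidates: " + alerts);
            } finally {
                client.close();
            }
        }
    }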
Used Amazon EMR to process Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
Imported data from various data sources, performed transformations using Hive and MapReduce, loaded the data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
Developed MapReduce/Spark Python modules for machine learning & predictive
analytics in Hadoop on AWS.
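The modules above were written in Python; purely as an illustration, and keeping Java as the single language for the sketches in this document, the following is a rough MLlib equivalent for training a simple classifier on data in HDFS. The paths, the CSV layout, and the two-class setup are assumptions:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.classification.LogisticRegressionModel;
    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS;
    import org.apache.spark.mllib.linalg.Vectors;
    import org.apache.spark.mllib.regression.LabeledPoint;

    public class PredictiveModelJob {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("PredictiveAnalytics");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Assumed CSV layout on HDFS: label,feature1,feature2,...
            JavaRDD<LabeledPoint> training = sc.textFile("hdfs:///data/training.csv")
                .map(line -> {
                    String[] parts = line.split(",");
                    double[] features = new double[parts.length - 1];
                    for (int i = 1; i < parts.length; i++) {
                        features[i - 1] = Double.parseDouble(parts[i]);
                    }
                    return new LabeledPoint(Double.parseDouble(parts[0]), Vectors.dense(features));
                });

            LogisticRegressionModel model = new LogisticRegressionWithLBFGS()
                .setNumClasses(2)
                .run(training.rdd());

            model.save(sc.sc(), "hdfs:///models/churn");  // persist the trained model for scoring jobs
            sc.stop();
        }
    }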
Loaded and transformed large sets of structured, semi-structured, and unstructured data.
Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
Implemented end-to-end systems for data analytics and data automation, and integrated them with custom visualization tools using R, Hadoop, MongoDB, and Cassandra.
Involved in the installation and configuration of the Cloudera distribution of Hadoop, including NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data (see the sketch below).
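A minimal sketch of running such a log-parsing query from Java over HiveServer2 JDBC; the host, the raw_logs table, and the regular expression are assumptions for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveLogQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Placeholder HiveServer2 host and database
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {
                // raw_logs is an assumed table holding one raw log line per row
                ResultSet rs = stmt.executeQuery(
                    "SELECT regexp_extract(line, '^(\\\\S+)', 1) AS host, count(*) AS hits " +
                    "FROM raw_logs GROUP BY regexp_extract(line, '^(\\\\S+)', 1)");
                while (rs.next()) {
                    System.out.println(rs.getString("host") + "\t" + rs.getLong("hits"));
                }
            }
        }
    }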
Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and Python with a broad variety of machine learning methods, including classification, regression, and dimensionality reduction.
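For the Kafka/Spark Streaming piece of that stack, here is a minimal Java sketch using the Spark 1.x direct-stream API; the broker address, topic name, and the per-batch count (a stand-in for the real processing) are assumptions:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class KafkaStreamJob {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("KafkaStreamJob");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092");             // placeholder broker
            Set<String> topics = new HashSet<>(Arrays.asList("server-metrics")); // placeholder topic

            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

            // Count records per micro-batch as a stand-in for the real scoring logic
            stream.map(record -> record._2())
                  .count()
                  .print();

            jssc.start();
            jssc.awaitTermination();
        }
    }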
Used S3 buckets to store the JARs and input datasets, and used DynamoDB to store the processed output (see the sketch below).
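A minimal sketch of that pattern with the AWS SDK for Java (v1); the bucket, key, table, and attribute names are hypothetical, and credentials are assumed to come from the default provider chain:

    import java.io.File;

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.document.Item;
    import com.amazonaws.services.dynamodbv2.document.Table;
    import com.amazonaws.services.s3.AmazonS3Client;

    public class S3DynamoSink {
        public static void main(String[] args) {
            // Credentials resolved from the default provider chain (env vars, instance profile, etc.)
            AmazonS3Client s3 = new AmazonS3Client();
            // Hypothetical bucket/key names for the job JAR
            s3.putObject("analytics-artifacts", "jobs/etl-job.jar", new File("target/etl-job.jar"));

            DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient());
            Table results = dynamoDB.getTable("processed_results");   // hypothetical table
            results.putItem(new Item()
                .withPrimaryKey("runId", "2015-06-01-daily")
                .withString("status", "SUCCEEDED")
                .withNumber("recordCount", 1250000));
            System.out.println("Artifacts uploaded and run metadata recorded.");
        }
    }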
Worked with Cassandra for non-relational data storage and retrieval in enterprise use cases, and wrote MapReduce jobs using the Java API and Pig Latin.
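A minimal sketch of the Cassandra access pattern with the DataStax Java driver; the contact point, keyspace, table, and columns are assumptions for illustration:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class CassandraAccess {
        public static void main(String[] args) {
            // Placeholder contact point and keyspace/table names
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("enterprise")) {
                session.execute(
                    "INSERT INTO events (event_id, source, payload) VALUES (uuid(), 'app01', 'sample')");
                ResultSet rs = session.execute("SELECT source, payload FROM events LIMIT 10");
                for (Row row : rs) {
                    System.out.println(row.getString("source") + " : " + row.getString("payload"));
                }
            }
        }
    }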
Improved the performance of existing algorithms in Hadoop using Spark Context, Spark SQL, and Spark on YARN, as in the sketch below.
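A minimal Java sketch of replacing a MapReduce-style aggregation with Spark SQL (Spark 1.x DataFrame API); the HDFS paths and the JSON field names are assumptions:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.SQLContext;

    public class SparkSqlRollup {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("SparkSqlRollup");
            JavaSparkContext sc = new JavaSparkContext(conf);
            SQLContext sqlContext = new SQLContext(sc);

            // Assumed JSON event data on HDFS with "region" and "amount" fields
            DataFrame events = sqlContext.read().json("hdfs:///data/events");
            events.registerTempTable("events");

            DataFrame summary = sqlContext.sql(
                "SELECT region, SUM(amount) AS total FROM events GROUP BY region");
            summary.cache();                               // keep the aggregate in memory for reuse
            summary.write().parquet("hdfs:///data/summary");

            sc.stop();
        }
    }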
Involved in creating a data lake by extracting the customer's Big Data from various data sources into Hadoop HDFS, including data from Excel, flat files, Oracle, SQL Server, MongoDB, Cassandra, HBase, Teradata, and Netezza, as well as log data from servers.
Performed data synchronization between EC2 and S3, Hive stand-up, and AWS profiling.
Created reports for the BI team by using Sqoop to export data into HDFS and Hive, and was involved in creating Hive tables and loading data into dynamic partition tables (see the sketch below).
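A minimal sketch of a dynamic-partition load driven from Java over Hive JDBC; the host and the staging_sales/sales_partitioned tables are assumptions for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class DynamicPartitionLoad {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {
                // Enable dynamic partitioning for this session
                stmt.execute("SET hive.exec.dynamic.partition = true");
                stmt.execute("SET hive.exec.dynamic.partition.mode = nonstrict");
                // The partition column (sale_date) is taken from the last selected column
                stmt.execute(
                    "INSERT OVERWRITE TABLE sales_partitioned PARTITION (sale_date) " +
                    "SELECT store_id, amount, sale_date FROM staging_sales");
            }
        }
    }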
Involved in managing and reviewing Hadoop log files, and migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
Worked on the Talend ETL tool and used features such as context variables and database components such as the Oracle input/output, tFileCompare, tFileCopy, and Oracle close ETL components.
Worked on NoSQL databases including HBase and MongoDB. Configured MySQL
Database to store Hive metadata.
Deployed and tested the system on a Hadoop MapR cluster, and worked on different file formats such as SequenceFiles, XML files, and MapFiles using MapReduce programs.
Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch below), and imported data from RDBMS environments into HDFS using Sqoop for report generation and visualization in Tableau.
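A minimal sketch of such a data-cleaning job using the Hadoop MapReduce Java API; the expected field count and the CSV format are assumptions:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class DataCleaningJob {

        // Map-only job: drop malformed rows and trim whitespace from each field
        public static class CleaningMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
            private static final int EXPECTED_FIELDS = 8;   // assumed record width

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",", -1);
                if (fields.length != EXPECTED_FIELDS) {
                    return;                                  // skip malformed records
                }
                StringBuilder cleaned = new StringBuilder();
                for (int i = 0; i < fields.length; i++) {
                    if (i > 0) cleaned.append(',');
                    cleaned.append(fields[i].trim());
                }
                context.write(NullWritable.get(), new Text(cleaned.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "data-cleaning");
            job.setJarByClass(DataCleaningJob.class);
            job.setMapperClass(CleaningMapper.class);
            job.setNumReduceTasks(0);                        // map-only job
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }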
Developed ETL mappings using mapplets and reusable transformations, and various transformations such as Source Qualifier, Expression, Connected and Unconnected Lookup, Router, Aggregator, Filter, Sequence Generator, Update Strategy, Normalizer, Joiner, and Rank in PowerCenter Designer.
Worked on the Oozie workflow engine for job scheduling, and created and maintained technical documentation for launching Hadoop clusters and executing Pig scripts.
Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Oozie, Sqoop, Pig, Java, Tableau, REST API, Maven, Storm, Kafka, SQL, ETL, AWS, MapR, PySpark, JavaScript, Shell Scripting.
Java Developer
CoBank - Denver, CO                                                   July 2010 to October 2012
Responsibilities:
Worked on Java Struts 1.0 with a SQL Server database to develop an internal application for ticket creation (see the sketch below).
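A minimal sketch of a Struts 1.x-style Action for such a ticket-creation flow (the execute() form shown here belongs to later 1.x releases); the parameter and forward names are assumptions tied to a hypothetical struts-config.xml:

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class CreateTicketAction extends Action {

        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response)
                throws Exception {
            // Read the submitted summary; in the real application this would be
            // persisted to the SQL Server database before forwarding
            String summary = request.getParameter("summary");
            request.setAttribute("ticketSummary", summary);
            return mapping.findForward("success");   // forward assumed to be defined in struts-config.xml
        }
    }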
Designed and developed GUI using JSP, HTML, DHTML and CSS.
Mapped an internal tool to the ServiceNow ticket creation tool.
Wrote the Hibernate configuration file and Hibernate mapping files, and defined persistence classes to persist data into the Oracle database, as in the sketch below.
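A minimal sketch of such a persistence call with Hibernate 3.x; the Ticket entity and the DAO name are hypothetical, and the Oracle connection details are assumed to live in hibernate.cfg.xml:

    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;
    import org.hibernate.cfg.Configuration;

    public class TicketDao {

        // SessionFactory built from hibernate.cfg.xml (Oracle dialect, connection details)
        private static final SessionFactory SESSION_FACTORY =
                new Configuration().configure().buildSessionFactory();

        public void saveTicket(Ticket ticket) {      // Ticket is an assumed mapped persistent class
            Session session = SESSION_FACTORY.openSession();
            Transaction tx = session.beginTransaction();
            try {
                session.save(ticket);                // INSERT into the mapped Oracle table
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();
                throw e;
            } finally {
                session.close();
            }
        }
    }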
Developed screens using JSP, HTML5, CSS, JavaScript, jQuery, ExtJS, and AJAX.
Individually developed parser logic to decode the spawn file generated on the client side and to generate tickets based on business requirements.
Applied XSL transforms to certain XML data (see the sketch below), developed Ant scripts for compilation and deployment, and performed unit testing using JUnit.
Built SQL queries to fetch the required data and columns from the production database.
Implemented MVC architecture for front-end development using the Spring MVC framework, as in the sketch below.
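A minimal sketch of a Spring MVC (Spring 3-style) controller for the ticket flow; the URL paths, view names, and parameter are assumptions resolved by a hypothetical JSP view resolver:

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RequestParam;

    @Controller
    public class TicketController {

        // Renders the ticket form view (resolved to a JSP by the view resolver)
        @RequestMapping(value = "/tickets/new", method = RequestMethod.GET)
        public String newTicketForm() {
            return "ticketForm";
        }

        // Handles form submission and passes data to the confirmation view
        @RequestMapping(value = "/tickets", method = RequestMethod.POST)
        public String createTicket(@RequestParam("summary") String summary, Model model) {
            model.addAttribute("summary", summary);
            return "ticketCreated";
        }
    }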
Implemented the user interface using HTML5, JSP, CSS3, and JavaScript/jQuery, and performed validations using JavaScript libraries.
Used the Tomcat server for deployment; a modified Agile/Scrum methodology was used for development of this application.
Used Web services (SOAP) for transmission of large blocks of XML data over
HTTP.
Heavily involved in database work, writing stored procedures, triggers, and tables based on requirements (see the sketch below).
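A minimal sketch of invoking such a stored procedure from Java via JDBC; the connection URL, credentials, procedure name, and parameters are hypothetical:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class TicketProcCall {
        public static void main(String[] args) throws Exception {
            // Placeholder JDBC URL, credentials, and stored procedure name
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:sqlserver://dbhost:1433;databaseName=tickets", "appuser", "secret");
                 CallableStatement cs = conn.prepareCall("{call usp_create_ticket(?, ?)}")) {
                cs.setString(1, "Printer offline on floor 3");
                cs.setString(2, "HIGH");
                cs.execute();
            }
        }
    }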
Prepared Documentation and User Guides to identify the various Attributes and
Metrics needed from Business.
Handled SVN version control as the code repository, and conducted Knowledge Transfer (KT) sessions for new recruits on the business value and technical functionality incorporated in the developed modules.
Created a maintenance plan for the production database. Certified as an Oracle Certified Java Programmer 6.
Environment: MS Windows 2000, OS/390, J2EE (JSP, Struts, Spring, Hibernate), RESTful, SOAP, SQL Server 2005, Eclipse, Tomcat 6, HTML, CSS, JSP, JSON, AJAX, JUnit, SQL, MySQL.