TP 4 - second term


Practical assignment N°

1) Translate the titles and subtitles.

2) Choose one of the subtitles and search the internet for information about it (in Spanish). Take notes.
3) Translate the phrases highlighted in bold. Which modal verbs do you find in them?
4) Translate the section you chose in point 2. Compare the information.
https://www.ibm.com/topics/big-data-analytics

What is big data analytics?


Big data analytics refers to the systematic processing and analysis of large amounts of
data and complex data sets, known as big data, to extract valuable insights. Big data
analytics allows for the uncovering of trends, patterns and correlations in large amounts of
raw data to help analysts make data-informed decisions. This process allows organizations
to leverage the exponentially growing data generated from diverse sources,
including internet-of-things (IoT) sensors, social media, financial transactions and smart
devices to derive actionable intelligence through advanced analytic techniques.

In the early 2000s, advances in software and hardware capabilities made it possible for
organizations to collect and handle large amounts of unstructured data. With this
explosion of useful data, open-source communities developed big data frameworks to
store and process this data. These frameworks are used for distributed storage and
processing of large data sets across a network of computers. Along with additional tools
and libraries, big data frameworks can be used for:

 Predictive modeling by incorporating artificial intelligence (AI) and statistical algorithms
 Statistical analysis for in-depth data exploration and to uncover hidden patterns
 What-if analysis to simulate different scenarios and explore potential outcomes
 Processing diverse data sets, including structured, semi-structured and
unstructured data from various sources.
Four main data analysis methods – descriptive, diagnostic, predictive and prescriptive –
are used to uncover insights and patterns within an organization's data. These methods
facilitate a deeper understanding of market trends, customer preferences and other
important business metrics.

Differences between big data and traditional data

The main difference between big data analytics and traditional data analytics is the type of
data handled and the tools used to analyze it. Traditional analytics deals with structured
data, typically stored in relational databases. This type of database helps ensure that data
is well-organized and easy for a computer to understand. Traditional data analytics relies
on statistical methods and tools like structured query language (SQL) for querying
databases.
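As the paragraph above notes, traditional analytics queries structured data with SQL. A minimal sketch using Python's built-in sqlite3 module (the `sales` table and its columns are hypothetical, chosen only to illustrate a typical aggregation query):

```python
import sqlite3

# In-memory relational database with a hypothetical sales table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 200.0)],
)

# A typical structured query: aggregate revenue per region
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 320.0), ('south', 80.0)]
```

Because the schema is fixed and every field is typed, the query engine can answer such questions directly, which is exactly the property unstructured big data lacks.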

Big data analytics involves massive amounts of data in various formats, including
structured, semi-structured and unstructured data. The complexity of this data requires
more sophisticated analysis techniques. Big data analytics employs advanced techniques
like machine learning and data mining to extract information from complex data sets. It
often requires distributed processing systems like Hadoop to manage the sheer volume of
data.
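Hadoop's MapReduce model splits work across many machines; the idea behind it can be illustrated with a toy single-process word count in Python (this is a conceptual sketch, not actual Hadoop code, and the corpus is invented):

```python
from collections import Counter
from itertools import chain

# Toy corpus standing in for files spread across a cluster
partitions = [
    "big data needs distributed processing",
    "distributed processing scales with data",
]

# Map phase: each partition independently emits (word, 1) pairs
mapped = [[(word, 1) for word in text.split()] for text in partitions]

# Shuffle + reduce phase: pairs for the same word are merged into a count
counts = Counter()
for word, one in chain.from_iterable(mapped):
    counts[word] += one

print(counts["data"])  # 2
```

In a real cluster each partition's map step runs on a different node, and the shuffle moves intermediate pairs so that each reducer sees all pairs for its words.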

Four main data analysis methods


These are the four methods of data analysis at work within big data:
Descriptive analytics
The "what happened" stage of data analysis. Here, the focus is on summarizing and
describing past data to understand its basic characteristics.
Diagnostic analytics
The “why it happened” stage. By delving deep into the data, diagnostic analysis identifies
the root causes of the patterns and trends observed in descriptive analytics.
Predictive analytics
The “what will happen” stage. It uses historical data, statistical modeling and machine
learning to forecast trends.
Prescriptive analytics
Describes the “what to do” stage, which goes beyond prediction to provide
recommendations for optimizing future actions based on insights derived from all of the
previous analyses.
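The four stages can be sketched on a toy sales series: descriptive summarizes what happened, diagnostic compares segments to ask why, predictive extrapolates, and prescriptive turns the forecast into a recommendation (all numbers and the decision rule are illustrative, not a real forecasting method):

```python
# Monthly sales figures (illustrative data)
sales = [100, 110, 125, 90, 140, 155]

# Descriptive: what happened? Summarize the past period.
average = sum(sales) / len(sales)

# Diagnostic: why? Compare the first and second halves of the period.
first_half = sum(sales[:3]) / 3
second_half = sum(sales[3:]) / 3

# Predictive: naive extrapolation from the average month-over-month change
deltas = [b - a for a, b in zip(sales, sales[1:])]
forecast = sales[-1] + sum(deltas) / len(deltas)

# Prescriptive: a simple decision rule on top of the forecast
action = "increase stock" if forecast > average else "hold stock"
print(average, forecast, action)  # 120.0 166.0 increase stock
```

Real deployments replace the naive extrapolation with statistical models or machine learning, but the division of labor between the four stages is the same.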

The five V's of big data analytics


The following dimensions highlight the core challenges and opportunities inherent in big
data analytics.
Volume
The sheer volume of data generated today, from social media feeds, IoT devices,
transaction records and more, presents a significant challenge. Traditional data storage
and processing solutions are often inadequate to handle this scale efficiently. Big data
technologies and cloud-based storage solutions enable organizations to store and manage
these vast data sets cost-effectively, protecting valuable data from being discarded due to
storage limitations.
Velocity
Data is being produced at unprecedented speeds, from real-time social media updates to
high-frequency stock trading records. The velocity at which data flows into organizations
requires robust processing capabilities to capture, process and deliver accurate analysis in
near real-time. Stream processing frameworks and in-memory data processing are
designed to handle these rapid data streams and balance supply with demand.
Variety
Today's data comes in many formats, from structured to numeric data in traditional
databases to unstructured text, video and images from diverse sources like social media
and video surveillance. This variety demands flexible data management systems to handle
and integrate disparate data types for comprehensive analysis. NoSQL databases, data
lakes and schema-on-read technologies provide the necessary flexibility to accommodate
the diverse nature of big data.

Veracity
Data reliability and accuracy are critical, as decisions based on inaccurate or incomplete
data can lead to negative outcomes. Veracity refers to the data's trustworthiness,
encompassing data quality, noise and anomaly detection issues. Techniques and tools for
data cleaning, validation and verification are integral to ensuring the integrity of big data,
enabling organizations to make better decisions based on reliable information.
Value
Big data analytics aims to extract actionable insights that offer tangible value. This involves
turning vast data sets into meaningful information that can inform strategic decisions,
uncover new opportunities and drive innovation. Advanced analytics, machine learning
and AI are key to unlocking the value contained within big data, transforming raw data
into strategic assets.

Operationalizing big data analytics

Data professionals, analysts, scientists and statisticians prepare and process data in a data
lakehouse, which combines the performance of a data warehouse with the flexibility of a
data lake to clean data and ensure its quality. The process of turning raw data into
valuable insights encompasses several key stages:
 Collect data: The first step involves gathering data, which can be a mix of
structured and unstructured forms from myriad sources like cloud, mobile
applications and IoT sensors. This step is where organizations adapt their data
collection strategies and integrate data from varied sources into central
repositories like a data lake, which can automatically assign metadata for better
manageability and accessibility.
 Process data: After being collected, data must be systematically organized,
extracted, transformed and then loaded into a storage system to ensure accurate
analytical outcomes. Processing involves converting raw data into a format that is
usable for analysis, which might involve aggregating data from different sources,
converting data types or organizing data into structured formats. Given the
exponential growth of available data, this stage can be challenging. Processing
strategies may vary between batch processing, which handles large data volumes
over extended periods and stream processing, which deals with smaller real-time
data batches.
 Clean data: Regardless of size, data must be cleaned to ensure quality and
relevance. Cleaning data involves formatting it correctly, removing duplicates and
eliminating irrelevant entries. Clean data prevents the corruption of output and
safeguards reliability and accuracy.
 Analyze data: Advanced analytics, such as data mining, predictive analytics,
machine learning and deep learning, are employed to sift through the processed
and cleaned data. These methods allow users to discover patterns, relationships
and trends within the data, providing a solid foundation for informed decision-
making.
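The four stages above can be sketched end to end; the records, field names and the duplicate/missing-value cases below are hypothetical, chosen to show one concrete pass through collect, process, clean and analyze:

```python
# Collect: records gathered from different sources, in mixed shape
raw = [
    {"id": 1, "amount": "120.5", "source": "mobile"},
    {"id": 2, "amount": "80.0", "source": "iot"},
    {"id": 2, "amount": "80.0", "source": "iot"},   # duplicate record
    {"id": 3, "amount": None, "source": "web"},     # missing value
]

# Process: convert raw string fields into analyzable types,
# dropping records whose amount was never captured
processed = [
    {**r, "amount": float(r["amount"])} for r in raw if r["amount"] is not None
]

# Clean: remove duplicates, keeping the first occurrence of each id
seen, clean = set(), []
for r in processed:
    if r["id"] not in seen:
        seen.add(r["id"])
        clean.append(r)

# Analyze: a simple aggregate over the cleaned data
total = sum(r["amount"] for r in clean)
print(len(clean), total)  # 2 200.5
```

At big-data scale each of these steps becomes a distributed job (batch or streaming), but the ordering, and the reason cleaning precedes analysis, is unchanged.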

Under the Analyze umbrella, there are potentially many technologies at work, including
data mining, which is used to identify patterns and relationships within large data sets;
predictive analytics, which forecasts future trends and opportunities; and deep learning,
which mimics human learning patterns to uncover more abstract ideas.

Deep learning uses an artificial neural network with multiple layers to model complex
patterns in data. Unlike traditional machine learning algorithms, deep learning learns from
images, sound and text without manual help. For big data analytics, this powerful
capability means the volume and complexity of data is not an issue.

Natural language processing (NLP) models allow machines to understand, interpret and
generate human language. Within big data analytics, NLP extracts insights from massive
unstructured text data generated across an organization and beyond.
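Real NLP relies on trained language models, but the basic idea of pulling signal out of unstructured text can be shown with a minimal term-frequency sketch (the ticket texts and stopword list are invented for illustration; this is not an NLP model):

```python
import re
from collections import Counter

# Unstructured text, e.g. support tickets (illustrative)
tickets = [
    "Login fails after the latest update",
    "Update broke the dashboard login page",
    "Dashboard loads slowly since the update",
]

# Words carrying little topical signal in this toy corpus
STOPWORDS = {"the", "after", "since", "broke", "fails", "loads"}

# Tokenize, lowercase, drop stopwords, count term frequency
words = re.findall(r"[a-z]+", " ".join(tickets).lower())
freq = Counter(w for w in words if w not in STOPWORDS)

print(freq["update"], freq["login"], freq["dashboard"])  # 3 2 2
```

Even this crude count surfaces that "update" dominates the complaints; NLP models extend the same goal to meaning, sentiment and entity extraction.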

Types of big data

Structured Data
Structured data refers to highly organized information that is easily searchable and
typically stored in relational databases or spreadsheets. It adheres to a rigid schema,
meaning each data element is clearly defined and accessible in a fixed field within a record
or file. Examples of structured data include:

 Customer names and addresses in a customer relationship management (CRM) system
 Transactional data in financial records, such as sales figures and account balances
 Employee data in human resources databases, including job titles and salaries

Structured data's main advantage is its simplicity for entry, search and analysis, often
using straightforward database queries like SQL. However, the rapidly expanding universe
of big data means that structured data represents a relatively small portion of the total
data available to organizations.

Unstructured Data

Unstructured data lacks a pre-defined data model, making it more difficult to collect,
process and analyze. It comprises the majority of data generated today, and includes
formats such as:

 Textual content from documents, emails and social media posts
 Multimedia content, including images, audio files and videos
 Data from IoT devices, which can include a mix of sensor data, log files and time-
series data

The primary challenge with unstructured data is its complexity and lack of uniformity,
requiring more sophisticated methods for indexing, searching and analyzing. NLP, machine
learning and advanced analytics platforms are often employed to extract meaningful
insights from unstructured data.

Semi-structured Data

Semi-structured data occupies the middle ground between structured and unstructured
data. While it does not reside in a relational database, it contains tags or other markers to
separate semantic elements and enforce hierarchies of records and fields within the data.
Examples include:

 JSON (JavaScript Object Notation) and XML (Extensible Markup Language) files,
which are commonly used for web data interchange
 Email, where the data has a standardized format (e.g., headers, subject, body) but
the content within each section is unstructured
 NoSQL databases, which can store and manage semi-structured data more efficiently
than traditional relational databases
Semi-structured data is more flexible than structured data but easier to analyze than
unstructured data, providing a balance that is particularly useful in web applications and
data integration tasks.
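A small example of what "tags or other markers" means in practice: JSON keys mark the semantic elements, while some fields remain free text. Parsing with Python's standard json module (the payload is hypothetical):

```python
import json

# A hypothetical web API payload: tagged structure around free-form content
payload = '''{
  "user": "a_customer",
  "timestamp": "2024-05-01T10:00:00Z",
  "tags": ["billing", "urgent"],
  "body": "My invoice total looks wrong, can someone check it?"
}'''

record = json.loads(payload)

# The markers (keys) give structure without a rigid relational schema
print(record["user"])               # a_customer
print(record["tags"][0])            # billing
print("invoice" in record["body"])  # True
```

The `user` and `tags` fields can be queried like structured data, while the `body` field still needs unstructured-text techniques, which is exactly the middle ground the section describes.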

The benefits of using big data analytics

Ensuring data quality and integrity, integrating disparate data sources, protecting data
privacy and security and finding the right talent to analyze and interpret data can present
challenges to organizations looking to leverage their extensive data volumes. What
follows are the benefits organizations can realize once they see success with big data
analytics:

Real-time intelligence

One of the standout advantages of big data analytics is the capacity to provide real-time
intelligence. Organizations can analyze vast amounts of data as it is generated from
myriad sources and in various formats. Real-time insight allows businesses to make quick
decisions, respond to market changes instantaneously and identify and act on
opportunities as they arise.

Better-informed decisions

With big data analytics, organizations can uncover previously hidden trends, patterns and
correlations. A deeper understanding equips leaders and decision-makers with the
information needed to strategize effectively, enhancing business decision-making in
supply chain management, e-commerce, operations and overall strategic direction.

Cost savings

Big data analytics drives cost savings by identifying business process efficiencies and
optimizations. Organizations can pinpoint wasteful expenditures by analyzing large
datasets, streamlining operations and enhancing productivity. Moreover, predictive
analytics can forecast future trends, allowing companies to allocate resources more
efficiently and avoid costly missteps.

Better customer engagement

Understanding customer needs, behaviors and sentiments is crucial for successful
engagement, and big data analytics provides the tools to achieve this understanding.
Companies gain insights into consumer preferences and tailor their marketing strategies
by analyzing customer data.

Optimized risk management strategies


Big data analytics enhances an organization's ability to manage risk by providing the tools
to identify, assess and address threats in real time. Predictive analytics can foresee
potential dangers before they materialize, allowing companies to devise preemptive
strategies.
Careers involving big data analytics

As organizations across industries seek to leverage data to drive decision-making, improve
operational efficiencies and enhance customer experiences, the demand for skilled
professionals in big data analytics has surged. Here are some prominent career paths that
utilize big data analytics:

Data scientist

Data scientists analyze complex digital data to assist businesses in making decisions. Using
their data science training and advanced analytics technologies, including machine
learning and predictive modeling, they uncover hidden insights in data.

Data analyst

Data analysts turn data into information and information into insights. They use statistical
techniques to analyze and extract meaningful trends from data sets, often to inform
business strategy and decisions.

Data engineer

Data engineers prepare, process and manage big data infrastructure and tools. They also
develop, maintain, test and evaluate data solutions within organizations, often working
with massive datasets to assist in analytics projects.

Machine learning engineer

Machine learning engineers focus on designing and implementing machine learning
applications. They develop sophisticated algorithms that learn from and make predictions
on data.

Business intelligence analyst

Business intelligence (BI) analysts help businesses make data-driven decisions by analyzing
data to produce actionable insights. They often use BI tools to convert data into easy-to-
understand reports and visualizations for business stakeholders.

Data visualization specialist


These specialists focus on the visual representation of data. They create data
visualizations that help end users understand the significance of data by placing it in a
visual context.

Data architect

Data architects design, create, deploy and manage an organization's data architecture.
They define how data is stored, consumed, integrated and managed by different data
entities and IT systems.
Big data analytics products

IBM and Cloudera Cloud Data Solutions


IBM and Cloudera have partnered to create an industry-leading, enterprise-grade big data
framework distribution plus a variety of cloud services and products — all designed to
achieve faster analytics at scale.

IBM Db2 Database


IBM Db2 Database on IBM Cloud Pak for Data combines a proven, AI-infused, enterprise-
ready data management system with an integrated data and AI platform built on the
security-rich, scalable Red Hat OpenShift foundation.

IBM Big Replicate


IBM Big Replicate is an enterprise-class data replication software platform that keeps data
consistent in a distributed environment, on-premises and in the hybrid cloud, including
SQL and NoSQL databases.

Answer

Translation of the titles

 Análisis de grandes datos

 Diferencia entre grandes datos y datos tradicionales

 Cuatro métodos principales de análisis de datos

 Las cinco V del análisis de grandes datos

 Operacionalizar el análisis de grandes datos

 Tipos de grandes datos

 Los beneficios de utilizar análisis de grandes datos


 Productos de análisis de grandes datos

Translation of the subtitles

 Análisis descriptivo

 Análisis de diagnóstico

 Análisis predictivo

 Análisis prescriptivo

 Volumen

 Velocidad

 Variedad

 Veracidad

 Valor

 Datos estructurados

 Datos no estructurados

 Datos semiestructurados

 Inteligencia en tiempo real

 Decisiones mejor informadas

 Ahorro de costos

 Mejor compromiso con el cliente

 Estrategias optimizadas de gestión de riesgos

 Científico de datos

 Analista de datos

 Ingeniero de datos

 Ingeniero de aprendizaje automático

 Analista de inteligencia de negocios

 Especialista en visualización de datos

 Arquitecto de datos

2)
