“You Live The Way You Think”
1. Dimension modelling types along with their significance
Data modelling:
1) E-R diagrams
2) Dimensional modelling
2.a) Logical modelling
2.b) Physical modelling
2. What is the flow of loading data into fact & dimensional tables
Here is the sequence of loading a data warehouse.
• The source data is first loaded into the staging area, where data cleansing takes place.
• The data from the staging area is then loaded into the dimensions/lookups.
• Finally, the fact tables are loaded from the corresponding source tables in the staging area.
3. Orchestrate vs DataStage Parallel Extender?
Orchestrate itself is an ETL tool with extensive parallel processing capabilities that runs
on the UNIX platform. DataStage used Orchestrate with DataStage XE (the beta version of 6.0)
to incorporate the parallel processing capabilities. The DataStage vendor has since
purchased Orchestrate, integrated it with DataStage XE, and released the new version as
DataStage 6.0, i.e. Parallel Extender.
5. How do you execute datastage job from command line ...
Use the "dsjob" command as follows:
dsjob -run -jobstatus projectname jobname
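When the command is launched from a script, the argument list can be assembled programmatically. The following is a minimal sketch that only builds the argument list for the dsjob call shown above. The helper names `build_dsjob_command` and `run_job`, and the project, job and parameter names, are hypothetical; the `-param name=value` option is assumed to be available in your dsjob version.

```python
import subprocess


def build_dsjob_command(project, job, params=None):
    """Return the argument list for running a job with dsjob."""
    cmd = ["dsjob", "-run", "-jobstatus"]
    # Assumption: each job parameter is passed as its own -param name=value pair.
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]
    # Project name and job name come last, as in the command above.
    cmd += [project, job]
    return cmd


def run_job(project, job, params=None):
    """Run the job and return the dsjob exit code (requires dsjob on the PATH)."""
    return subprocess.run(build_dsjob_command(project, job, params)).returncode


if __name__ == "__main__":
    # Hypothetical names, printed rather than executed.
    print(" ".join(build_dsjob_command("projectname", "jobname", {"RUN_DATE": "2005-12-31"})))
```

Keeping the command builder separate from the runner makes it possible to log or review the exact command line before anything is executed.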
6. What are Stage Variables, Derivations and Constraint...
Stage Variable - An intermediate processing variable that retains its value while rows are
read and does not pass the value into a target column.
Derivation - An expression that specifies the value to be passed on to the target column.
Constraint - A condition that is either true or false and that controls the flow of data
on a link.
7. What is the default cache size? How do you change the cache size if needed?
The default cache size is 256 MB. We can increase it by going into DataStage Administrator,
selecting the Tunables tab and specifying the cache size there.
8. Containers: Usage and Types?
A container is a collection of stages used for the purpose of reusability. There are 2
types of containers:
a) Local container: job specific.
b) Shared container: used in any job within a project. There are two types of shared
container:
1. Server shared container. Used in server jobs (can also be used in parallel jobs).
2. Parallel shared container. Used in parallel jobs.
You can also include server shared containers in parallel jobs as a way of incorporating
server job functionality into a parallel stage (for example, you could use one to make a
server plug-in stage available to a parallel job).
16. Differentiate Database data and Data warehouse data?
By database, one means OLTP (On-Line Transaction Processing). This can be the source
systems or the ODS (Operational Data Store), which contains the transactional data.
Or
Database data is in the form of OLTP and data warehouse data is in the form of OLAP.
OLTP is for transactional processing and OLAP is for analysis.
17. What are the difficulties faced in using DataStage? Or what are the constraints in using DataStage?
1. I feel the most difficult part is understanding the DataStage Director job log error
messages. They are not given in a properly readable form.
2. We don't have as many date functions available as in Informatica or traditional
relational databases.
3. DataStage is a unique product in terms of functions. For example, most databases and
ETL tools use UPPER to convert from lower case to upper case, whereas DataStage uses
"UCASE". DataStage is peculiar when compared to other ETL tools.
Or
1) What if the number of lookups is large?
2) What will happen if the job aborts for some reason while loading the data?
18. What are XML files, how do you read data from XML files, and what stage is to be used?
First, you can use the XML metadata importer to import the XML source definition. Once
that is done, you can use the XML Input stage to read the XML document. For each and every
element of the XML, we should give the XPath expression in the XML Input stage.
The XML stage documentation explains this clearly.
19. Why do you use SQL LOADER or OCI STAGE?
When the source data is enormous
or
for bulk data, we can use OCI or SQL Loader, depending upon the source.
20. Suppose there are a million records: did you use OCI? If not, then what stage do you prefer?
Using Orabulk.
21. How do you populate source files?
There are many ways to populate them; writing a SQL statement in Oracle is one way.
22. How do you pass the parameter to the job sequence if the job is running at night?
Two ways:
1. Set the default values of the parameters in the Job Sequencer and map these parameters
to the job.
2. Run the job in the sequencer using the dsjob utility, where we can specify the value
to be taken for each parameter.
23. What happens if the job fails at night?
The job sequence aborts. If you are on call, you will be called to fix and rerun the job.
24. What is SQL tuning? How do you do it?
SQL tuning can be done using cost-based optimization.
These pfile parameters are very important: sort_area_size, sort_area_retained_size,
db_file_multiblock_read_count, open_cursors, cursor_sharing,
optimizer_mode=choose/rule.
25. What is project life cycle and how do you implement it?
No Ans
26. How do you track performance statistics and enhance it?
Through the Monitor we can view the performance statistics.
27. How do you do an Oracle 4-way inner join if there are 4 Oracle input files?
The question is asked incorrectly.
There won't be any Oracle file. It is an Oracle table or view object.
I have never heard of an Oracle input file.
Can you please explain what your actual question is?
28. What is the order of execution done internally in the transformer with the stage editor having input
Stage variables, then constraints, then column derivations or expressions.
There is only one primary input link to the Transformer, and there can be many reference
input links and many output links. You can output to multiple output links by defining
constraints on the output links.
You can edit the order of the input and output links from the Link Ordering tab in the
Transformer stage properties dialog.
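The evaluation order described above (stage variables, then constraints, then column derivations) can be illustrated with a small Python model. This is only a conceptual sketch and not how the Transformer is implemented; the function `transform_row`, the link names and the sample expressions are all hypothetical.

```python
def transform_row(row, stage_vars, links):
    """Evaluate one input row: stage variables, then constraints, then derivations."""
    # 1. Stage variables are evaluated first, top to bottom;
    #    a later variable may use the value of an earlier one.
    sv = {}
    for name, expr in stage_vars:
        sv[name] = expr(row, sv)

    outputs = {}
    for link_name, constraint, derivations in links:
        # 2. The constraint decides whether the row goes down this output link.
        if not constraint(row, sv):
            continue
        # 3. Column derivations are evaluated only for links that accept the row.
        outputs[link_name] = {col: expr(row, sv) for col, expr in derivations.items()}
    return outputs


# Hypothetical stage variable and two output links with opposite constraints.
stage_vars = [("svName", lambda row, sv: row["name"].strip().upper())]
links = [
    ("valid", lambda row, sv: sv["svName"] != "", {"NAME": lambda row, sv: sv["svName"]}),
    ("empty", lambda row, sv: sv["svName"] == "", {"ID": lambda row, sv: row["id"]}),
]

print(transform_row({"id": 1, "name": " smith "}, stage_vars, links))
```

Because the constraints are independent per link, the same row could be written to several output links if more than one constraint is true.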
29. What are the often used stages or stages you worked with in your last project?
A) Transformer, ORAOCI8/9, ODBC, Link Partitioner, Link Collector, Hash, Aggregator, Sort.
30. How many jobs have you created in your last project?
100+ jobs for every 6 months if you are in development; if you are in testing, 40 jobs
for every 6 months, although it need not be the same number for everybody.
31. Tell me the environment in your last projects
Give the OS of the server and the OS of the client of your most recent project.
32. Did you parameterize the job or hard-code the values in the jobs?
Always parameterized the job. The values come either from Job Properties or from a
'Parameter Manager', a third-party tool. You should never hard-code parameters in your
jobs. The most often parameterized variables in a job are: DB DSN name, username,
password, and the dates with respect to which the data is to be looked up.
33. Have you ever been involved in updating the DS versions like DS 5.X? If so, tell us some of the steps you have taken in doing so.
Yes. The following are some of the steps I have taken in doing so:
1) Definitely take a backup of the whole project(s) by exporting the project as a
.dsx file.
2) See that you are using the same parent folder for the new version, so that your old
jobs using hard-coded file paths still work.
3) After installing the new version, import the old project(s); you have to compile
them all again. You can use the 'Compile All' tool for this.
4) Make sure that all your DB DSNs are created with the same names as the old ones.
This step is for moving DS from one machine to another.
5) In case you are just upgrading your DB from Oracle 8i to Oracle 9i, there is a tool
on the DS CD that can do this for you.
6) Do not stop the 6.0 server before the upgrade; the version 7.0 install process
collects project information during the upgrade. There is NO rework (recompilation of
existing jobs/routines) needed after the upgrade.
34. What is the Hash File stage and what is it used for?
Used for look-ups. It is like a reference table. It is also used in place of ODBC or OCI
tables for better performance.
We can also use the Hash File stage to avoid/remove duplicate rows by specifying
the hash key on a particular field.
35. What are static hash files and dynamic hash files?
The names themselves suggest what they mean. In general we use Type-30 dynamic
hash files. The data file has a default size of 2 GB and the overflow file is used if the
data exceeds the 2 GB size.
Hashed files have a default size established by their modulus and separation when you
create them, and this can be static or dynamic. Overflow space is only used when data
grows over the reserved size for one of the groups (sectors) within the file. There are
as many groups as specified by the modulus.
40. Explain the differences between Oracle 8i/9i?
Multiprocessing, databases, more dimensional modelling.
41. Do you know about INTEGRITY/QUALITY stage?
Integrity/QualityStage is a data integration tool from Ascential which is used to
standardize/integrate the data from different sources.
Or
QualityStage can be integrated with DataStage. In QualityStage we have many stages, such
as Investigate, Match and Survivorship, with which we can do the quality-related work.
To integrate it with DataStage we need the QualityStage plug-in.
42. Do you know about METASTAGE?
In simple terms, metadata is data about data, and it can describe anything in DS
(data set, sequential file, etc.).
MetaStage is used to handle the metadata, which will be very useful for data lineage
and data analysis later on. Metadata defines the type of data we are handling. These
data definitions are stored in the repository and can be accessed with the use of
MetaStage.
44. What are OConv() and Iconv() functions and where are they used?
Iconv() - Converts a string to an internal storage format.
Oconv() - Converts an expression to an output format.
Iconv is used to convert a date into the internal format, i.e. one that only DataStage
can understand. For example, for a date coming in mm/dd/yyyy format, DataStage will
convert the date into a number such as 740.
You can then use this 740 to derive the date in your own format by using Oconv.
Suppose you want to change mm/dd/yyyy to dd/mm/yyyy.
You will use Iconv and Oconv together:
Oconv(Iconv(InputDateString, IconvFormat), OconvFormat)
where IconvFormat and OconvFormat are conversion codes (see the help for the available codes).
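The "number" in the example above is a day count. The sketch below models that idea in Python, assuming that day 0 of the internal date format is 31 December 1967; the function names `to_internal` and `from_internal` are hypothetical and are not DataStage functions.

```python
from datetime import date, timedelta

# Assumption: day 0 of the internal date format is 31 December 1967.
EPOCH = date(1967, 12, 31)


def to_internal(d):
    """Model of the Iconv step: calendar date -> internal day number."""
    return (d - EPOCH).days


def from_internal(n):
    """Model of the Oconv step: internal day number -> calendar date."""
    return EPOCH + timedelta(days=n)


if __name__ == "__main__":
    print(to_internal(date(1968, 1, 1)))
    print(from_internal(740).strftime("%d/%m/%Y"))
```

Storing dates as plain integers is what makes it cheap to reformat them or do date arithmetic between the input and output conversions.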
45. What are Routines and where/how are they written and have you written any routines before?
Routines are stored in the Routines branch of the DataStage Repository, where you
can create, view or edit them. The following are the different types of routines:
1) Transform functions
2) Before-after job subroutines
3) Job Control routines
47. How did you handle reject data?
Typically a reject link is defined and the rejected data is loaded back into the data
warehouse. So a reject link has to be defined for every output link on which you wish to
collect rejected data. Rejected data is typically bad data, like duplicates of primary
keys or null rows where data is expected.
48. What are other performance tunings you have done in your last project to increase the performance of slowly running jobs?
Staged the data coming from ODBC/OCI/DB2UDB stages or any database on the
server using Hash/Sequential files, for optimum performance and also for data recovery
in case the job aborts.
1. Tuned the OCI stage for 'Array Size' and 'Rows per Transaction' numerical
values for faster inserts, updates and selects.
2. Tuned the 'Project Tunables' in Administrator for better performance.
3. Used sorted data for the Aggregator.
4. Sorted the data as much as possible in the DB and reduced the use of DS-Sort for
better performance of jobs.
5. Removed the data not used from the source as early as possible in the job.
6. Worked with the DB admin to create appropriate indexes on tables for better
performance of DS queries.
7. Converted some of the complex joins/business logic in DS to stored procedures in
the DB for faster execution of the jobs.
8. If an input file has an excessive number of rows and can be split up, then use
standard logic to run jobs in parallel.
9. Before writing a routine or a transform, make sure that the functionality required
is not already in one of the standard routines supplied in the sdk or ds
utilities categories.
Constraints are generally CPU intensive and take a significant amount of time
to process. This may be the case if the constraint calls routines or external
macros, but if it is inline code then the overhead will be minimal.
10. Try to have the constraints in the 'Selection' criteria of the jobs themselves. This
will eliminate the unnecessary records before joins are made.
11. Tuning should occur on a job-by-job basis.
12. Use the power of the DBMS.
13. Try not to use a Sort stage when you can use an ORDER BY clause in the
database.
14. Using a constraint to filter a record set is much slower than performing a
SELECT ... WHERE ....
15. Make every attempt to use the bulk loader for your particular database. Bulk
loaders are generally faster than using ODBC or OLE.
49. How did you handle an 'Aborted' sequencer?
In almost all cases we have to delete the data it inserted from the DB manually, fix
the job, and then run the job again.
50. What are Sequencers?
Sequencers are job control programs that execute other jobs with preset job
parameters.
A sequencer allows you to synchronize the control flow of multiple activities in a job
sequence. It can have multiple input triggers as well as multiple output triggers. The
sequencer operates in two modes:
ALL mode. In this mode all of the inputs to the sequencer must be TRUE for any of the
sequencer outputs to fire.
ANY mode. In this mode, output triggers can be fired if any of the sequencer inputs
is TRUE.
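The two sequencer modes reduce to a logical AND and a logical OR over the input triggers. The following minimal Python sketch models that rule; the function name `sequencer_fires` is hypothetical.

```python
def sequencer_fires(inputs, mode="ALL"):
    """Return True if the sequencer outputs may fire for the given input triggers."""
    if mode == "ALL":
        # ALL mode: every input trigger must be TRUE.
        return all(inputs)
    if mode == "ANY":
        # ANY mode: a single TRUE input trigger is enough.
        return any(inputs)
    raise ValueError(f"unknown sequencer mode: {mode}")


if __name__ == "__main__":
    triggers = [True, False, True]
    print(sequencer_fires(triggers, "ALL"), sequencer_fires(triggers, "ANY"))
```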
51. How did you connect with DB2 in your last project?
Most of the time the data was sent to us in the form of flat files. The data is
dumped and sent to us. In some cases where we needed to connect to DB2 for look-ups,
for instance, we used ODBC drivers to connect to DB2 (or) DB2-UDB,
depending on the situation and availability. Certainly DB2-UDB is better in terms of
performance, as the native drivers are always better than ODBC drivers.
'iSeries Access ODBC Driver 9.00.02.02' - ODBC driver to connect to AS400/DB2.
52. Read the String functions in DS
Functions like [] -> sub-string function and ':' -> concatenation operator
Syntax: string [ [ start, ] length ]
string [ delimiter, instance, repeats ]
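For readers more familiar with general-purpose languages, the two bracket forms with an explicit start can be modelled in Python as below. This is a sketch of the semantics as understood here (1-based positions, delimited field extraction), not DataStage code; the function names `substring` and `fields` are hypothetical.

```python
def substring(s, start, length):
    """Model of string[start, length]; start is a 1-based character position."""
    return s[start - 1:start - 1 + length]


def fields(s, delimiter, instance, repeats=1):
    """Model of string[delimiter, instance, repeats].

    Returns `repeats` consecutive delimited fields, starting at field
    number `instance` (1-based), joined by the same delimiter.
    """
    parts = s.split(delimiter)
    return delimiter.join(parts[instance - 1:instance - 1 + repeats])


if __name__ == "__main__":
    print(substring("DATASTAGE", 5, 5))
    print(fields("a,b,c,d", ",", 2, 2))
    # The ':' concatenation operator corresponds to '+' on Python strings.
    print("Data" + "Stage")
```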
54. What will you do in a situation where somebody wants to send you a file and use that file as an input or reference and then run the job?
A. Under Windows: Use the 'WaitForFileActivity' under the Sequencers and then run
the job. Maybe you can schedule the sequencer around the time the file is expected
to arrive.
B. Under UNIX: Poll for the file. Once the file has arrived, start the job or sequencer,
depending on the file.
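Polling for the file can be as simple as checking for its existence at a fixed interval until a deadline passes. The sketch below shows one way to do that in Python; the function name `wait_for_file` and the default timeout and interval are hypothetical choices, and starting the job afterwards is left to the caller.

```python
import os
import time


def wait_for_file(path, timeout_s=3600.0, interval_s=30.0):
    """Poll until `path` exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Sleep for one interval, but never past the deadline.
        time.sleep(min(interval_s, remaining))
```

A caller would typically start the job or sequencer only when `wait_for_file` returns True, and raise an alert otherwise. Existence alone does not prove the sender has finished writing the file, so a separate "done" marker file is a common refinement.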
55. How would you call an external Java function which is not supported by DataStage?
Starting from DS 6.0 we have the ability to call external Java functions using a Java
package from Ascential. In this case we can even use the command line to invoke
the Java function, write the return values from the Java program (if any) to a file, and
use that file as a source in a DataStage job.
56. What is the utility you use to schedule the jobs on a UNIX server other than using Ascential Director?
Use the crontab utility along with the dsexecute() function, with the proper parameters
passed.
57. What are the command line functions that import and export the DS jobs?
A. dsimport.exe - imports the DataStage components.
B. dsexport.exe - exports the DataStage components.
58. How will you determine the sequence of jobs to load into the data warehouse?
First we execute the jobs that load the data into the dimension tables, then the fact
tables, and then the aggregate tables (if any).
60. The above might raise another question: why do we have to load the dimensional tables first, then the fact tables?
As we load the dimensional tables the (primary) keys are generated, and these
(primary) keys are foreign keys in the fact tables.
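The dependency can be sketched in a few lines of Python: the dimension load generates the keys, and the fact load can only resolve its foreign keys once that lookup exists. The function names `load_dimension` and `load_fact`, and the column names `customer_id`, `customer_sk` and `amount`, are hypothetical.

```python
def load_dimension(rows, natural_key):
    """Assign a generated key to each distinct natural key; return the lookup."""
    lookup = {}
    for row in rows:
        key = row[natural_key]
        if key not in lookup:
            lookup[key] = len(lookup) + 1  # simple sequence 1, 2, 3, ...
    return lookup


def load_fact(rows, natural_key, dim_lookup):
    """Resolve each fact row's foreign key; rows without a dimension match are rejected."""
    loaded, rejected = [], []
    for row in rows:
        sk = dim_lookup.get(row[natural_key])
        if sk is None:
            rejected.append(row)
        else:
            loaded.append({"customer_sk": sk, "amount": row["amount"]})
    return loaded, rejected


if __name__ == "__main__":
    customers = [{"customer_id": "C1"}, {"customer_id": "C2"}]
    sales = [{"customer_id": "C2", "amount": 10}, {"customer_id": "C9", "amount": 5}]
    dim = load_dimension(customers, "customer_id")  # dimensions first
    print(load_fact(sales, "customer_id", dim))     # facts second
```

If the fact load ran first, every row would land in the rejected list because no key would exist yet to look up.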
61. Tell me one situation from your last project where you faced a problem, and how did you solve it?
A. The jobs in which data was read directly from OCI stages were running extremely
slowly. I had to stage the data before sending it to the Transformer to make the jobs run
faster.
B. The job aborted in the middle of loading some 500,000 rows. The options were either
cleaning/deleting the loaded data and then running the fixed job, or running the job again
from the row at which the job had aborted. To make sure the load was proper we opted for
the former.
62. Does the selection of 'Clear the table and Insert rows' in the ODBC stage send a Truncate statement to the DB or does it do some kind of Delete logic?
There is no TRUNCATE on ODBC stages. The option is 'Clear the table', and that is a
DELETE FROM statement. On an OCI stage such as Oracle, you do have both Clear and
Truncate options. They are radically different in permissions (Truncate requires you
to have alter table permissions where Delete doesn't).
63. How do you rename all of the jobs to support your new file-naming conventions?
Create an Excel spreadsheet with the new and old names. Export the whole project as a
dsx. Write a Perl program which can do a simple rename of the strings by looking up
the Excel file. Then import the new dsx file, probably into a new project for testing.
Recompile all jobs. Be cautious: the names of the jobs must also be changed in
your job control jobs or Sequencer jobs, so you have to make the necessary changes
to these Sequencers.
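The rename step is essentially a whole-word search and replace driven by a lookup table. The sketch below shows the idea in Python rather than Perl, and reads the old/new pairs from CSV text rather than directly from Excel; both are substitutions made for this illustration. It treats the export as plain text and makes no assumption about the structure of a .dsx file. The function names and the job names in the example are hypothetical.

```python
import csv
import io
import re


def load_mapping(csv_text):
    """Parse 'old_name,new_name' rows into a dict, skipping blank or short rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def rename_jobs(export_text, mapping):
    """Replace every whole-word occurrence of an old job name with its new name."""
    if not mapping:
        return export_text
    # Try longer names first so a name that is a prefix of another cannot win.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(n) for n in names) + r")(?!\w)"
    )
    return pattern.sub(lambda m: mapping[m.group(1)], export_text)


if __name__ == "__main__":
    mapping = load_mapping("LoadCust,JB_LOAD_CUST\nLoadCustAddr,JB_LOAD_CUST_ADDR\n")
    # Sample text standing in for the exported project file.
    print(rename_jobs("run LoadCust then LoadCustAddr", mapping))
```

Because the substitution runs over the whole export, references to a renamed job inside job control or sequence jobs are rewritten in the same pass, though they should still be checked after import.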
64. Difference between Hash file and Sequential File?
A hash file stores the data based on a hash algorithm and on a key value. A sequential
file is just a file with no key column. A hash file can be used as a reference for a
lookup; a sequential file cannot.
66. How can we join one Oracle source and a Sequential file?
Join and Lookup can be used to join an Oracle source and a sequential file.
68. How can we implement Lookup in DataStage Server jobs?
In the server canvas we can perform 2 kinds of direct lookups:
one is by using a hashed file and the other is by using a Database/ODBC stage as a
lookup.
69. What are all the third party tools used in DataStage?
Autosys, TNG and Event Coordinator are some of them that I know and have worked with.
70. What is the difference between routine and transform and function?
Routines and Transformers sound similar, but a Routine describes the business logic,
whereas a Transformer specifies how to transform the data from one location to another
by applying transformation rules.
71. What are the Job parameters?
These parameters are used to provide administrative access and to change run-time
values of the job. Go to Edit > Job Parameters.
In the Parameters tab we can define the name, prompt, type and value.
72. How can we improve the performance of DataStage jobs?
Performance and tuning of DS jobs:
1. Establish baselines
2. Avoid the use of only one flow for tuning/performance testing
3. Work in increments
4. Evaluate data skew
5. Isolate and solve
6. Distribute file systems to eliminate bottlenecks
7. Do not involve the RDBMS in initial testing
8. Understand and evaluate the tuning knobs available
73. How can we create Containers?
There are two types of containers:
1. Local container
2. Shared container
A local container is available for that particular job only,
whereas shared containers can be used anywhere in the project.
Local container:
Step 1: Select the stages required
Step 2: Edit > Construct Container > Local
Shared container:
Step 1: Select the stages required
Step 2: Edit > Construct Container > Shared
Shared containers are stored in the Shared Containers branch of the tree
structure.
74. When should we use ODS?
DWHs are typically read only and batch updated on a schedule.
ODSs are maintained in more real time and trickle fed constantly.
75. What is the difference between an operational data store (ODS) & a data warehouse?
Data that is volatile is ODS data, and data that is nonvolatile, historical and time
variant is DWH data. In simple terms, ODS is dynamic data.
A data warehouse is a decision support database for organisational needs. It is a
subject-oriented, nonvolatile, integrated, time-variant collection of data.
An ODS (Operational Data Store) is an integrated collection of related information. It
contains a maximum of 90 days of information.
76. How to handle date conversions in DataStage? Convert a mm/dd/yyyy format to yyyy-dd-mm?
We use a) the "Iconv" function - internal conversion.
b) the "Oconv" function - external conversion.
The function to convert mm/dd/yyyy format to yyyy-dd-mm is
Oconv(Iconv(Fieldname, "D/MDY[2,2,4]"), "D-YDM[4,2,2]")
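The same two-step idea (parse the input into an internal value, then format it for output) can be shown in Python with the standard `datetime` module. This is an analogy rather than DataStage code; the function name `mdy_to_ydm` is hypothetical.

```python
from datetime import datetime


def mdy_to_ydm(value):
    """Convert a date string from mm/dd/yyyy to yyyy-dd-mm."""
    # Step 1 (the Iconv role): parse the external string into an internal value.
    parsed = datetime.strptime(value, "%m/%d/%Y")
    # Step 2 (the Oconv role): format the internal value for output.
    return parsed.strftime("%Y-%d-%m")


if __name__ == "__main__":
    print(mdy_to_ydm("12/31/2005"))
```

Parsing first also validates the input: a string such as 13/45/2005 raises an error instead of silently producing a malformed output date.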
77. How do you pass the filename as the parameter for a job?
1. Go to DataStage Administrator -> Projects -> Properties -> Environment ->
User Defined. Here you can see a grid, where you can enter your parameter name
and the corresponding path of the file.
2. Go to the Stage tab of the job, select the NLS tab, click on "Use Job
Parameter" and select the parameter name which you gave above. The
selected parameter name appears in the text box beside the "Use Job Parameter"
button. Copy the parameter name from the text box and use it in your job. Keep the
project default in the text box.
78. How will you call an external function or subroutine from DataStage?
You can call external functions and subroutines by using Before/After stage/job
subroutines:
ExecSH
ExecDOS
or
by using the Command Stage plug-in, or by calling the routine from an external command
activity in a Job Sequence.
79. Dimensional modelling is again subdivided into 2 types.
a) Star Schema - Simple & much faster. Denormalized form.
b) Snowflake Schema - Complex, with more granularity. More normalized form.
80. How do you eliminate duplicate rows?
Use the Remove Duplicates stage: it takes a single sorted data set as input, removes all
duplicate records, and writes the results to an output data set.
Or
If you don't have the Remove Duplicates stage, you can use a hash file to eliminate
duplicates.
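The hash file approach works because a keyed store holds one row per key, so writing a second row with the same key overwrites the first. A Python dictionary behaves the same way, which makes for a compact illustration; the function name `dedupe_by_key` and the sample columns are hypothetical.

```python
def dedupe_by_key(rows, key):
    """Keep one row per key value; a later row overwrites an earlier one."""
    store = {}
    for row in rows:
        # Same key -> same slot, so only the last row written survives.
        store[row[key]] = row
    return list(store.values())


if __name__ == "__main__":
    rows = [
        {"id": 1, "city": "Pune"},
        {"id": 2, "city": "Delhi"},
        {"id": 1, "city": "Mumbai"},
    ]
    print(dedupe_by_key(rows, "id"))
```

Note that this keeps the last row seen for each key, so the input order determines which duplicate survives.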
81. What is DS Administrator used for - did you use it?
The Administrator enables you to set up DataStage users, control the purging of the
Repository, and, if National Language Support (NLS) is enabled, install and manage
maps and locales.
It is primarily used to create the DataStage project, assign the user roles to the
project, and set parameters of the jobs at project level. Assigning the users to the
project can also be done here.
82. What is DS Designer used for - did you use it?
You use the Designer to build jobs by creating a visual design that models the flow
and transformation of data from the data source through to the target warehouse.
The Designer graphical interface lets you select stage icons, drop them onto the
Designer work area, and add links.
83. What about System variables?
DataStage provides a set of variables containing useful system information that you
can access from a transform or routine. System variables are read-only.
@DATE - The internal date when the program started. See the Date function.
@DAY - The day of the month extracted from the value in @DATE.
@FALSE - The compiler replaces the value with 0.
@FM - A field mark, Char(254).
@IM - An item mark, Char(255).
@INROWNUM - Input row counter. For use in constraints and derivations in Transformer stages.
@OUTROWNUM - Output row counter (per link). For use in derivations in Transformer stages.
@LOGNAME - The user login name.
@MONTH - The current month extracted from the value in @DATE.
@NULL - The null value.
@NULL.STR - The internal representation of the null value, Char(128).
@PATH - The pathname of the current DataStage project.
@SCHEMA - The schema name of the current DataStage project.
@SM - A subvalue mark (a delimiter used in UniVerse files), Char(252).
@SYSTEM.RETURN.CODE - Status codes returned by system processes or commands.
@TIME - The internal time when the program started. See the Time function.
@TM - A text mark (a delimiter used in UniVerse files), Char(251).
@TRUE - The compiler replaces the value with 1.
@USERNO - The user number.
@VM - A value mark (a delimiter used in UniVerse files), Char(253).
@WHO - The name of the current DataStage project directory.
@YEAR - The current year extracted from @DATE.
REJECTED - Can be used in the constraint expression of a Transformer stage of an
output link. REJECTED is initially TRUE, but is set to FALSE whenever an output link
is successfully written.
84. How do you eliminate duplicate rows?
delete from emp where rowid not in (select max(rowid) from emp
group by column_name) (min(rowid) works equally well)
Or
DataStage provides us with a Remove Duplicates stage in Enterprise Edition. Using
that stage we can eliminate the duplicates based on a key column.
Or
Removal of duplicates can be done in two ways:
1. Use the "Duplicate Data Removal" stage
or
2. Use GROUP BY on all the columns used in the SELECT; the duplicates will go away.
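The rowid-based delete can be tried out without an Oracle instance. The sketch below uses Python's built-in `sqlite3` module as a stand-in, since SQLite also exposes a `rowid` on ordinary tables; the table `emp`, its columns and its rows are made up for the example.

```python
import sqlite3


def remove_duplicates(conn):
    """Delete all but the first-inserted copy of each (empno, ename) pair."""
    conn.execute(
        "DELETE FROM emp WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM emp GROUP BY empno, ename)"
    )


# In-memory database with some deliberately duplicated rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "SMITH"), (1, "SMITH"), (2, "JONES"), (2, "JONES"), (3, "KING")],
)

remove_duplicates(conn)
remaining = conn.execute("SELECT empno, ename FROM emp ORDER BY empno").fetchall()
print(remaining)
```

Grouping by every column that defines a duplicate is what makes the subquery return exactly one surviving rowid per distinct row.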
85. What are the types of Hashed File?
Hashed files are classified broadly into 2 types.
a) Static - subdivided into 17 types based on primary key pattern.
b) Dynamic - subdivided into 2 types:
i) Generic
ii) Specific.
The default hashed file is "Dynamic - Type Random 30 D".
86. What is DS Manager used for - did you use it?
DataStage Manager is used for export and import purposes.
The main use of export and import is sharing jobs and projects from one project to
another.
87. What is DS Director used for - did you use it?
DataStage Director is used to run the jobs and validate the jobs.
We can go to DataStage Director from DataStage Designer itself.
88. How do we do the automation of dsjobs?
We can call a DataStage batch job from the command prompt using 'dsjob'. We can also
pass all the parameters from the command prompt.
Then call this shell script from any of the commercially available schedulers.
The second option is to schedule these jobs using DataStage Director.
Or
"dsjobs" can be automated by using shell scripts on a UNIX system.
89. How do you merge two files in DS?
Either use the Copy command as a before-job subroutine if the metadata of the 2 files
is the same, or create a job to concatenate the 2 files into one if the metadata is
different.
90. What's the difference between DataStage Developers and DataStage Designers? What are the skill
A DataStage developer is one who will code the jobs. A DataStage designer is one who will
design the job; that is, he will deal with the blueprints and will design the jobs and
the stages that are required in developing the code.
91. Importance of Surrogate Key in Data warehousing?
The concept of a surrogate key comes into play when there is a slowly changing dimension
in a table.
In such a condition there is a need for a key by which we can identify the changes
made in the dimensions.
These slowly changing dimensions can be of three types, namely SCD1, SCD2 and SCD3.
Surrogate keys are system-generated keys. Mainly they are just a sequence of numbers,
or they can be alphanumeric values.
Or
A surrogate key should be a system-generated number, and it should be a small integer.
For each dimension table, depending on the SCD type and the total number of records
expected over a 4-year period, you may limit the maximum number. This will improve the
indexing, performance and query processing.
94. In how many places can you call Routines?
You can call them in four places:
(i) Transform of routine
(A) Date transformation
(B) Upstring transformation
(ii) Transform of the Before & After subroutines
(iii) XML transformation
(iv) Web-based transformation
95. What is the Batch Program and how can it be generated?
A batch program is a program that is generated at run time and maintained by DataStage
itself, but you can easily change it on the basis of your requirements (Extraction,
Transformation, Loading). The batch program generated depends on the nature of your job,
either a simple job or a sequencer job. You can see this program under the Job Control option.
96. Scenario-based question: Suppose 4 jobs are controlled by the sequencer (job 1, job 2, job 3, job 4). Job 1 has 10,000 rows, but after running the job only 5,000 rows have been loaded into the target table, the remaining rows are not loaded, and your job is going to be aborted. How can you sort out the problem?
Suppose the job sequencer synchronizes or controls 4 jobs but job 1 has a problem. In this
condition you should go to Director and check what type of problem is showing: a data
type problem, a warning message, job failed or job aborted. If the job failed, it means a
data type problem or a missing column action. So you should go to the Run window ->
Click -> Tracing -> Performance, or in your target table -> General -> Action, where
there are two options:
(i) On Fail -- Commit, Continue
(ii) On Skip -- Commit, Continue.
First check how many rows are already loaded, then select the On Skip option and
continue; for the remaining data that was not loaded, select On Fail, Continue.
Run the job again and you will definitely get a success message.
97. I want to process 3 files sequentially, one by one. How can I do that? While processing the files it should fetch the files automatically.
If the metadata for all the files is the same, then create a job having the file name as
a parameter, then use the same job in a routine and call the job with different file
names. Or you can create a sequencer to use the job.
98. What happens if RCP is disabled?
Runtime column propagation (RCP): if RCP is enabled for any job, and specifically for
those stages whose output connects to the shared container input, then the metadata
will be propagated at run time, so there is no need to map it at design time.
If RCP is disabled for the job, OSH has to perform an import and export
every time the job runs, and the processing time of the job is also increased.
99. Default nodes for DataStage Parallel Edition
Actually, the number of nodes depends on the number of processors in your system. If
your system has two processors, we will get two nodes by default.
“Time Tests You, Until You Taste The Time”