Last Names of Albanians Ahg12015
Summary
To describe the isonymic structure of Albania, the distribution of 3,068,447 surnames was studied in the 12
prefectures and their administrative subdivisions: the 36 districts and 321 communes. The number of different surnames
found was 37,184. The effective surname number for the entire country was 1327; the average was 653.3 ± 84.3 for
prefectures, 365.9 ± 42.0 for districts and 122.6 ± 8.7 for communes. These values display a variation of inbreeding between
administrative levels in the Albanian population, which can be attributed to the previously published Prefecture effect.
Matrices of isonymic distances between units within administrative levels were tested for correlation with geographic
distances. The correlations were highest for prefectures (r = 0.71 ± 0.06 for Euclidean distance) and lowest for communes
(r = 0.37 ± 0.011 for Nei's distance).
The multivariate analyses (principal component analysis and multidimensional scaling) of prefectures identify three main
clusters, one toward the North, the second in Central Albania, and the third in the South. This pattern is consistent
with important subclusters from districts and communes, which indicate that the country may have been colonised by
diffusion of groups in the North-South direction, and from Macedonia in the East, over a pre-existing Illyrian population.
Keywords: Albania, population structure, isonymy, inbreeding, isolation by distance
Introduction
Albania has a long and complex history. It was populated by
an Aryan people, the Illyrians, around 3000 BC. In historical
times, it was conquered by the Macedonians of Philip in
300–350 BC, coming under Greek power. Then, it became a
Roman province, first under the Republic and then under
the Empire, for about five centuries. After the split of the
Empire, it stayed under the rule of the Byzantines until the
15th century, when it became part of the Ottoman Empire.
When the Ottoman Empire dissolved in 1912, nationalism
arose in Albania, and the country gained independence in
Corresponding author: Chiara Scapoli, Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46,
I-44121 Ferrara, Italy. Tel: +39-0532-455744; Fax: +39-0532249761; E-mail: [email protected]
Surnames in Albania
investigate the Albanian population with the aim of detecting its structure through the isonymic methods defined by
Crow and Mange (1965) at the three administrative levels of the nation, namely the 12 prefectures, 36
districts and 321 communes. The data made available to us are the surnames of the electors in the database of the
2009 general elections.
We report here how, in Albania, isonymic distance varies
with geography, as we observed in other European countries.
We obtained indications of the direction of migration by
studying the geographic heterogeneity of surnames. For each
level, we studied the effective surname number, α, and the
value of random inbreeding, F_ST.
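As a minimal sketch of how these two quantities can be obtained from a list of surnames, the Python snippet below assumes definitions commonly used in the isonymy literature: random isonymy I as the probability that two individuals drawn without replacement share a surname, α ≈ 1/I and F_ST ≈ I/4. The function name and the toy data are illustrative only, and the exact estimators applied by the authors are not shown in this text.

```python
from collections import Counter

def isonymy_parameters(surnames):
    """Return (I, alpha, F_ST) for one locality from a list of surnames."""
    counts = Counter(surnames)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two individuals")
    # Probability that two individuals drawn without replacement share a surname.
    isonymy = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    if isonymy == 0:
        raise ValueError("no surname occurs twice, so alpha is undefined")
    alpha = 1.0 / isonymy   # effective surname number (assumed alpha = 1/I)
    f_st = isonymy / 4.0    # random inbreeding (assumed F_ST = I/4)
    return isonymy, alpha, f_st

# Toy data, not taken from the Albanian electoral register.
sample = ["Hoxha"] * 4 + ["Shehu"] * 3 + ["Kola"] * 2 + ["Gjoni"]
I, alpha, f_st = isonymy_parameters(sample)
```

With only ten individuals and four surnames the toy values are far from realistic; the point is only the relation between the three quantities.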
We recall that surnames are a weak marker for inbreeding
and a strong marker for migration. Two Bianchis in Italy
may be more or less distantly related, as may two Whites in
Britain, but one Bianchi in Britain or one White in Italy
is indicative of migration, as clearly as an immunofluorescent
cell in a negative field. With this proviso, our aim in this work
was to study the present isonymic structure of Albania
resulting from surname drift and population movements in an
area about 320 km long and on average about 90 km wide,
bordering the Adriatic Sea, south of Montenegro and
Kosovo, west of Macedonia and north of Greece.
I. Mikerezi et al.
In the following subsections, we briefly touch on and recall the definitions of some of the statistics derived from the
surname distributions and their meaning in the study of microevolution in human groups (for an exhaustive review, see
Relethford, 1988).
$$E = \sqrt{1 - \sum_{k} \sqrt{p_{ki}\,p_{kj}}}$$
Random kinship
Random kinship $\varphi_{IJ}(x)$ between any two localities I and J at
distance x is given by

$$\varphi_{IJ}(x) = K \exp(-Bx)$$

(Malecot, 1955; Kimura, 1960), where K is the average kinship at geographic
distance x = 0, say average $F_{ST}$, and B is a function of average mutation
rate and of the variance of x. Then, $\varphi_{IJ}(x)$ is always positive
and is expected to decrease exponentially to 0 with increasing
distance. Random kinship was defined as

$$\varphi_{IJ}(x) = I_{IJ}(x)/4$$

(Barrai et al., 2012), with average $F_{ST}$ as the average kinship
at distance x = 0.
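The exponential model can be fitted in several ways. A minimal sketch, assuming a simple least-squares fit of log kinship on distance (not necessarily the procedure used by the authors), is given below. The function name and the synthetic values of K and B are hypothetical.

```python
import numpy as np

def fit_malecot(distance_km, kinship):
    """Estimate K and B in phi(x) = K * exp(-B * x) by least squares on log(phi)."""
    x = np.asarray(distance_km, dtype=float)
    phi = np.asarray(kinship, dtype=float)
    keep = phi > 0                      # log is undefined for pairs sharing no surnames
    slope, intercept = np.polyfit(x[keep], np.log(phi[keep]), 1)
    return np.exp(intercept), -slope    # K, B

# Synthetic, noise-free data generated with hypothetical K = 0.001 and B = 0.02 per km.
x = np.array([10.0, 50.0, 100.0, 150.0, 200.0])
phi = 0.001 * np.exp(-0.02 * x)
K_hat, B_hat = fit_malecot(x, phi)
```

Taking logarithms turns the model into a straight line, log φ = log K − Bx, so the intercept estimates log K and the slope estimates −B. Pairs with zero kinship have to be left out of a fit of this kind.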
Table 1 Prefecture, district, number of surnames N, number of different surnames S, Fisher's α, Karlin-McGregor's ν, isonymy I, and F_ST in
Albania. Districts grouped by prefecture; rows with an empty District column refer to the whole prefecture.

| Prefecture | District | N | S | α | ν | I | F_ST |
|---|---|---|---|---|---|---|---|
| Berat | | 169,377 | 5276 | 496 | 0.00293 | 0.00201 | 0.000505 |
| | Berat | 112,084 | 4042 | 420 | 0.00374 | 0.00238 | 0.000597 |
| | Kucove | 35,894 | 2123 | 277 | 0.00767 | 0.0036 | 0.000907 |
| | Skrapar | 21,399 | 1314 | 273 | 0.01258 | 0.00366 | 0.000926 |
| Diber | | 120,994 | 2482 | 377 | 0.00312 | 0.00265 | 0.000664 |
| | Diber | 50,866 | 1216 | 298 | 0.00582 | 0.00335 | 0.000844 |
| | Mat | 42,669 | 1296 | 247 | 0.00575 | 0.00404 | 0.001017 |
| | Bulqize | 27,459 | 915 | 191 | 0.00691 | 0.00521 | 0.001312 |
| Durres | | 289,512 | 9698 | 775 | 0.00268 | 0.00129 | 0.000323 |
| | Durres | 236,662 | 9149 | 757 | 0.0032 | 0.00132 | 0.000331 |
| | Kruje | 52,850 | 1861 | 337 | 0.00633 | 0.00297 | 0.000746 |
| Elbasan | | 299,600 | 6555 | 457 | 0.00153 | 0.00219 | 0.000548 |
| | Elbasan | 197,185 | 5568 | 442 | 0.00224 | 0.00226 | 0.000566 |
| | Gramsh | 25,062 | 839 | 168 | 0.00663 | 0.00595 | 0.001497 |
| | Peqin | 26,185 | 1103 | 103 | 0.00396 | 0.00953 | 0.002392 |
| | Librazhd | 51,168 | 1309 | 186 | 0.00362 | 0.00536 | 0.001346 |
| Fier | | 352,352 | 7479 | 623 | 0.00177 | 0.00161 | 0.000402 |
| | Fier | 193,704 | 5379 | 510 | 0.00264 | 0.00196 | 0.000491 |
| | Lushnje | 128,406 | 3691 | 345 | 0.00268 | 0.0029 | 0.000726 |
| | Mallakaster | 30,242 | 998 | 147 | 0.00482 | 0.0068 | 0.001709 |
| Gjirokaster | | 121,628 | 4544 | 910 | 0.00744 | 0.0011 | 0.000277 |
| | Gjirokaster | 66,969 | 3150 | 767 | 0.01133 | 0.0013 | 0.00033 |
| | Tepelene | 28,946 | 1621 | 273 | 0.00934 | 0.00365 | 0.000922 |
| | Permet | 25,713 | 1539 | 460 | 0.01756 | 0.00217 | 0.000553 |
| Korce | | 264,449 | 7860 | 1110 | 0.00419 | 0.0009 | 0.000226 |
| | Korce | 152,114 | 6250 | 1108 | 0.00724 | 0.0009 | 0.000227 |
| | Kolonje | 15,813 | 1232 | 453 | 0.02783 | 0.00221 | 0.000567 |
| | Pogradec | 64,452 | 2497 | 378 | 0.00583 | 0.00264 | 0.000664 |
| | Devoll | 32,070 | 1462 | 211 | 0.00653 | 0.00473 | 0.001191 |
| Kukes | | 72,875 | 1844 | 351 | 0.0048 | 0.00284 | 0.000714 |
| | Kukes | 39,510 | 1113 | 190 | 0.00479 | 0.00524 | 0.001317 |
| | Has | 13,247 | 270 | 84 | 0.00629 | 0.0118 | 0.00297 |
| | Tropoje | 20,118 | 886 | 198 | 0.00973 | 0.00504 | 0.001272 |
| Lezhe | | 148,395 | 4080 | 173 | 0.00117 | 0.00576 | 0.001442 |
| | Lezhe | 72,257 | 2617 | 133 | 0.00184 | 0.0075 | 0.001879 |
| | Mirdite | 26,750 | 778 | 67 | 0.00249 | 0.0148 | 0.003708 |
| | Kurbin | 49,389 | 2192 | 298 | 0.006 | 0.00335 | 0.000842 |
| Shkoder | | 239,312 | 7350 | 658 | 0.00275 | 0.00152 | 0.000381 |
| | Shkoder | 179,065 | 6642 | 637 | 0.00355 | 0.00157 | 0.000394 |
| | Puke | 21,712 | 892 | 123 | 0.00562 | 0.0081 | 0.002036 |
| | Malesi madhe | 38,535 | 1235 | 260 | 0.00671 | 0.00384 | 0.000965 |
| Tirane | | 712,068 | 19,057 | 997 | 0.00141 | 0.001 | 0.000251 |
| | Tirane | 631,027 | 18,415 | 1048 | 0.00167 | 0.00095 | 0.000239 |
| | Kavaje | 81,041 | 2743 | 282 | 0.00347 | 0.00354 | 0.000889 |
| Vlore | | 277,885 | 7335 | 913 | 0.00328 | 0.00109 | 0.000275 |
| | Sarande | 74,963 | 3534 | 470 | 0.00623 | 0.00213 | 0.000535 |
| | Delvine | 23,788 | 1504 | 339 | 0.01404 | 0.00295 | 0.000747 |
| | Vlore | 179,134 | 5327 | 694 | 0.00386 | 0.00144 | 0.000362 |
Table 2 Comparison of isonymy parameters in nine European countries, in five South-American countries, in the United States and Texas,
and in Yakutia. Overall, 122 million surnames were analysed.

| Country | Sample size (SS, millions) | Surnames (S) | α (average) | Isolation by distance | Type-token (SS/S) |
|---|---|---|---|---|---|
| Europe | | | | | |
| Austria | 1 | 140,766 | 854 | 0.59 | 7.1 |
| Albania | 3.0 | 37,184 | 123 | 0.71 | 82 |
| Belgium | 1.1 | 137,442 | 997 | 0.74 | 8 |
| France | 6 | 495,104 | 1615 | 0.69 | 12.1 |
| Germany | 5.2 | 462,526 | 1596 | 0.51 | 11.2 |
| Holland | 2.4 | 126,485 | 787 | 0.46 | 19 |
| Italy | 5.1 | 215,623 | 1236 | 0.61 | 23.7 |
| Switzerland¹ | 1.7 | 166,116 | 891 | 0.72 | 10.2 |
| Spain | 3.6 | | | | |
| Spain, paternal | | 94,886 | 134 | 0.21 | 38 |
| Spain, maternal | | 110,034 | 144 | 0.26 | 33 |
| Asia | | | | | |
| Yakutia | 0.5 | 44,625 | 107 | 0.69 | 11.1 |
| North America | | | | | |
| United States | 18 | 899,585 | 1366 | 0.24 | 20 |
| Texas | 3.6 | 235,740 | 734 | 0.42 | 15.3 |
| South America | | | | | |
| Argentina³ | 22.6 | 414,441 | 422 | 0.47 | 54.5 |
| Venezuela² | 3.9 | 68,665 | 122 | 0.78 | 56.8 |
| Bolivia⁴ | 23.2 | 174,922 | 122 | 0.5 | 144.6 |
| Paraguay³ | 4.8 | 39,047 | 108 | 0.42 | 122.9 |

¹ Cantons. ² States. ³ Districts. ⁴ Provinces.
Isolation by distance
We studied isolation by distance through the correlation
of geographic with surname distances at the prefecture,
district and commune levels. We found that Euclidean,
Nei's and Lasker's distances between the 12 prefectures were
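A correlation between two distance matrices is often assessed with a Mantel-type permutation test (Mantel, 1967). The sketch below is one simple way to write such a test, assuming Pearson correlation on the upper triangle and random relabelling of units. The function name, the number of permutations and the synthetic data are illustrative assumptions, not the settings used in this study.

```python
import numpy as np

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Pearson correlation between two symmetric distance matrices, with a
    one-sided permutation p-value obtained by relabelling the units of dist_b."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    iu = np.triu_indices_from(a, k=1)          # each pair of units counted once
    r_obs = np.corrcoef(a[iu], b[iu])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        p = rng.permutation(a.shape[0])
        r_perm = np.corrcoef(a[iu], b[p][:, p][iu])[0, 1]
        hits += r_perm >= r_obs
    return r_obs, (hits + 1) / (permutations + 1)

# Synthetic example: 12 random points; the second matrix is the geographic
# distance distorted by a symmetric factor between 0.8 and 1.2.
rng = np.random.default_rng(1)
xy = rng.uniform(0, 300, size=(12, 2))
geo = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
factor = rng.uniform(0.8, 1.2, size=geo.shape)
factor = (factor + factor.T) / 2
iso = geo * factor
r, p_value = mantel(geo, iso)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary significance test for r would not be appropriate.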
Kinship
We plotted kinship between communes, as previously defined,
as a function of geographic distance (Fig. 4). Note that at the
commune level several pairs of communes (33 per thousand)
did not share surnames.
The decrease of kinship with distance is significantly exponential, as predicted by Malecot (1955) (see also Kimura,
1960). Specifically, the exponential decay should be characteristic of structures more linear than Albania, for example, as
observed by us in Chile. However, there is considerable and
significant agreement between Malecot's theory and kinship
decay in Albania. Then, the Malecot model is very strong
Prefectures
The MDS projection on the first two dimensions of the matrix between prefectures (Fig. S6) differentiates a few clusters,
which correspond to groups of neighbouring prefectures. In
the resulting dendrogram (Fig. S7), a first large cluster composed mainly of the central prefectures is observed: Tirane,
Durres, Elbasan, Diber, Fier and Berat. The last two form
a subcluster within this cluster. Then, three prefectures in
the South-East and the extreme South, namely Korce, Vlore
and Gjirokaster, form the next cluster. Finally, two prefectures
of the North cluster together, Shkoder and Lezhe, whereas
Kukes represents an exception because, despite being a mountainous prefecture of the North, it clusters together with the
Central prefectures, possibly due to emigration from the
poorer areas toward the highly populated and richer areas
around the capital Tirana.
From the MDS projection in Figure S6, some other minor
but relevant points emerge, which complement the clustering
of prefectures. In particular, Tirane, Durres and Elbasan stand
alone at the centre of the bidimensional projection, removed
from the other prefectures. Vlore is marginal, as is Korce.
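To illustrate how a distance matrix is projected onto two dimensions, the sketch below implements classical (Torgerson) MDS with NumPy. This is a swapped-in, simpler technique chosen for illustration: the stress value reported with the MDS results in this paper suggests an iterative stress-minimising variant, whose details are not given here. The function name and the four test points are hypothetical.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances approximate dist."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(gram)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Check on four points of a plane: the configuration is recovered up to
# rotation and reflection, so the pairwise distances are reproduced.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
coords = classical_mds(d)
d_hat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Applied to a 12 × 12 matrix of isonymic distances between prefectures, the two returned columns would play the role of the axes of a plot such as Figure S6, although the coordinates need not coincide with those of a stress-based solution.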
Districts
The projection on the first two dimensions of the MDS tends
to differentiate several clusters, which correspond fairly well
to neighbouring districts (Fig. S8).
In the dendrogram (Fig. S9), the districts of Malesi e
Madhe, Tropoje and Has, at the Northern border with Montenegro, cluster with Fier, Mallakaster and Vlore, which are in
Communes
We found that, only at the commune level, there were 157
pairs of communes out of 51,360 that did not share surnames. Of these 157 pairs, 49 included the commune of
Liqenas in Korce, which has a mainly Macedonian population. Also, 34 pairs included the commune of Lure in Diber,
but we did not find a good reason for this last preference.
Of course, there are various reasons why this absence of shared surnames in small communes may occur in Albania.
We believe that one reason, among others, is
the complexity of the Albanian alphabet, which often results in the same name being written differently in different
communes. However, there is also some effect of distance on
the phenomenon. The average geographic distance between
the 157 pairs having infinite Lasker's and Nei's distance is
128.9 ± 14.7 km. The average distance for the other 51,203
pairs is 95.9 ± 0.06 km, and the difference is significant
(t[∞] = 8.568, P < 0.0001). We bypassed the problem posed
in the multivariate analysis of the distance matrices by the
elements of infinite value by substituting the nearest maximum observed
for the 157 infinite isonymic distances. In this
way, we met no complexities in the subsequent analysis of the
distance matrices of Lasker and Nei. It is important to note
that if the 157 infinite distances are excluded, the correlations
for communes rise from 0.44 to 0.47 for Lasker, and from 0.37

Figure 5 Projection of Lasker's matrix of surname distances on districts in Albania by mapping (A) the first three
PCA factors (I: Factor 1 = 42.8%; II: Factor 2 = 26.9%; III: Factor 3 = 11.5%) and (B) the first three MDS dimensions
(I: Dimension 1; II: Dimension 2; III: Dimension 3; stress 11.2%).
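The origin of the infinite distances, and the substitution described for them, can be sketched as follows. The snippet assumes the form of Lasker's distance commonly used in the isonymy literature, L = −log(I_ij) with I_ij = Σ_k p_ki p_kj, and reads "nearest maximum observed" as the largest finite distance in the matrix. Both are assumptions, and the function names and toy counts are illustrative.

```python
import numpy as np

def lasker_distance(counts):
    """counts: (localities x surnames) array of surname counts.
    Returns -log(I_ij), with I_ij = sum_k p_ki * p_kj (assumed definition);
    pairs of localities sharing no surname get an infinite distance."""
    c = np.asarray(counts, dtype=float)
    p = c / c.sum(axis=1, keepdims=True)      # relative surname frequencies per locality
    shared = p @ p.T                          # isonymy between every pair of localities
    with np.errstate(divide="ignore"):
        dist = -np.log(shared)
    np.fill_diagonal(dist, 0.0)               # convention: zero distance of a unit to itself
    return dist

def cap_infinite(dist):
    """Replace infinite entries with the largest finite distance observed."""
    out = dist.copy()
    out[np.isinf(out)] = out[np.isfinite(out)].max()
    return out

# Toy counts for three localities and four surnames; localities 0 and 2 share none.
counts = np.array([[5, 3, 0, 0],
                   [2, 2, 2, 0],
                   [0, 0, 1, 4]])
raw = lasker_distance(counts)
capped = cap_infinite(raw)
```

After capping, every entry is finite, so the matrix can be passed to PCA, MDS or clustering routines that cannot handle infinite values.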
Conclusions
The methodology described in this paper was used to analyse
the isonymic structure of several South American countries
(Rodriguez-Larralde et al., 2000, 2011; Dipierri et al., 2005,
2011; Barrai et al., 2012). In these countries, 4 (Venezuela),
24 (Argentina), 23 (Bolivia), 4.5 (Paraguay) and 16.5 (Chile)
million surnames from the registers of electors were used.
In European countries and in the United States, we analysed surnames of telephone users (Barrai et al., 2001; Scapoli
et al., 2005, 2007; Rodriguez-Larralde et al., 2007). In thinly
populated Siberia, we used half a million surnames (Tarskaya
et al., 2009). The average value of α for all the cities (or states,
in the case of Venezuela and the United States, or districts,
in the case of Argentina and Paraguay), and the isolation by
distance measured by the correlation between isonymic and
geographic distances, are given in Table 2 for the countries
studied up to now. Several features emerge from the comparisons reported in Table 2: first, the general similarity among
European nations in profusion of surnames, as measured by α,
and in isolation by distance, as measured by the linear correlation; secondly, the relatively small value of α in Venezuela,
Bolivia, Paraguay, Spain, Chile and now Albania; and thirdly,
the practical absence of isolation by distance in the United
States, excluding bilingual Texas (Rodriguez-Larralde et al.,
2007). In Albania, the average number of persons having the
same surname (measured by the ratio Sample Size/Surnames,
given as the index SS/S in Table 2) is more similar (82) to
that of Argentina, Bolivia and Venezuela than to that of other
European countries. It may be of some interest to compare
our Table 2 with Table 1 of King and Jobling (2009). There,
they give the mean number of carriers per surname in 5538
households in 27 countries. Where applicable, their results are
consistent with ours.
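The SS/S index is a plain ratio, and the Albanian value can be checked from the totals quoted in the Summary. The variable names below are illustrative.

```python
# Figures quoted in the Summary: 3,068,447 surname occurrences, 37,184 distinct surnames.
sample_size = 3_068_447
distinct_surnames = 37_184
type_token = sample_size / distinct_surnames   # mean number of persons per surname
print(round(type_token, 1))                    # prints 82.5
```

The result agrees with the value of 82 reported for Albania in Table 2.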
Acknowledgements
The authors are grateful to the CEC of Albania, which provided
the data. The authors are also particularly grateful to both
Referees, who gave valuable advice. The work was supported
by grants from the University of Ferrara to Chiara Scapoli.
References
Adamic, L. A. & Huberman, B. A. (2002) Zipf's law and the Internet. Glottometrics 3, 143–150.
Barrai, I., Scapoli, C., Beretta, M., Nesti, C., Mamolini, E. & Rodriguez-Larralde, A. (1996) Isonymy and the genetic structure of Switzerland. I: The distributions of surnames. Ann Hum Biol 23, 431–455.
Barrai, I., Rodriguez-Larralde, A., Mamolini, E. & Scapoli, C. (2000) Elements of the surname structure of Austria. Ann Hum Biol 26, 115.
Barrai, I., Rodriguez-Larralde, A., Mamolini, E., Manni, F. & Scapoli, C. (2001) Elements of the surname structure of the USA. Am J Phys Anthropol 114, 109–123.
Barrai, I., Rodriguez-Larralde, A., Dipierri, J., Alfaro, E., Acevedo, N., Mamolini, E., Sandri, M., Carrieri, A. & Scapoli, C. (2012) Surnames in Chile. A study of the population of Chile through isonymy. Am J Phys Anthropol 147, 380–388. doi: 10.1002/ajpa.22000.
Bidollari, Ç. (2010) Onomastic investigations. In Albanian. Tirane: Botimet Kumi Editor.
Cavalli-Sforza, L. L. & Edwards, A. W. F. (1967) Phylogenetic analysis: models and estimation procedures. Am J Hum Genet 19, 233–257.
Cheshire, J. A. & Longley, P. A. (2012) Identifying spatial concentrations of surnames. Int J Geogr Inform Sci 26, 309–325.
Crow, J. F. & Mange, A. (1965) Measurements of inbreeding from the frequency of marriages between persons of the same surname. Eugen Q 12, 199–203.
Dipierri, J. E., Alfaro, E., Scapoli, C., Mamolini, E., Rodriguez-Larralde, A. & Barrai, I. (2005) Surnames in Argentina. A population study through isonymy. Am J Phys Anthropol 128, 199–209.
Dipierri, J. E., Rodriguez-Larralde, A., Alfaro, E. L., Scapoli, C., Mamolini, E., Salvatorelli, G., De Lorenzi, S., Sandri, M., Carrieri, A. & Barrai, I. (2011) Surnames in Paraguay: A study of the population of Paraguay through isonymy. Ann Hum Genet 75, 678–687. doi: 10.1111/j.1469-1809.2011.00676.x.
Fox, W. R. & Lasker, G. W. (1983) The distribution of surname frequencies. Int Stat Rev 51, 81–87.
Kimura, M. (1960) Outline of population genetics (in Japanese). Tokyo: Baifukan.
King, T. E. & Jobling, M. A. (2009) What's in a name? Y chromosomes, surnames and the genetic genealogy revolution. Trends Genet 25(8), 351–360.
Lasker, G. W. (1985) Surnames and genetic structure. Cambridge: Cambridge University Press.
Longley, P. A., Cheshire, J. A. & Mateos, P. (2011) Creating a regional geography of Britain through the spatial analysis of surnames. Geoforum 42, 506–516.
Malecot, G. (1955) Decrease of relationship with distance. Cold Spring Harbor Symp 20, 52–53.
Mantel, N. (1967) The detection of disease clustering and a generalized regression approach. Cancer Res 27, 209–220.
Mateos, P., Longley, P. A. & O'Sullivan, D. (2011) Ethnicity and population structure in personal naming networks. PLoS ONE 6, e22943. doi:10.1371/journal.pone.0022943.
Menozzi, P., Piazza, A. & Cavalli-Sforza, L. L. (1978) Synthetic maps of human gene frequencies in Europeans. Science 201, 786–792.
Mikerezi, I., Susanne, C., Bajrami, Z. & Kume, K. (1995) Differentiation of Albanian human populations and their relationships with Balkanic ethnic groups according to gene frequencies at ABO, MN and Rhesus loci. IUAES International Congress, April 20–21, 1995, Torino, Italia, p. 32.
Mikerezi, I., Pizzetti, P., Lucchetti, E. & Ekonomi, M. (2003) Isonymy and the genetic structure of Albanian population. Coll Antropol 27, 507–514.
Supporting Information
Additional supporting information may be found in the online
version of this article:
Table S1 Distribution of isonymy parameters.
Table S2 The 100 most frequent surnames in Albania.
Table S3 The most frequent names of Arabic origin in Albania.
Table S4 Surnames with the prefix "Papa" of clear Greek origin.
Figure S1 Variation of the number of occurrences in 3 million surnames in Albania.
Figure S2 Variation of Lasker's distance between 36 districts in Albania.
Figure S3 Variation of Lasker's distance between 321 communes in Albania.
Figure S4 Variation of Euclidean with geographic distance.
Figure S5 Variation of Nei's with geographic distance.
Figure S6 MDS on the matrix of Lasker's distances between prefectures.
Figure S7 Dendrogram of Albania prefectures.
Figure S8 MDS of Lasker's distance matrix between districts.
Figure S9 Dendrogram of districts from the matrix of Lasker's distance.
Figure S10 Projection of the 321 communes of Albania on the first two dimensions of the matrix of Lasker's distances.
Figure S11 Dendrogram of communes.
Received: 9 August 2012
Accepted: 18 November 2012