

(r = 0.94, p < 0.0001), suggesting that the information regarding skeletal-part fre-
quencies provided by MNE is virtually identical to that provided by NISP.
The close relationship between NISP and MNE is widespread. Enloe et al. (2000)
described a large sample of saiga antelope (Saiga tatarica) remains from Prolom II
Cave, located on the Crimean Peninsula in the Ukraine. The material dates to the
Mousterian cultural period (Table 6.15). NISP and MNE are strongly correlated
(r = 0.92, p < 0.0001); the MNE values are redundant with the NISP values with
Figure 6.18. Relationship between NISP and MNE values for size class II cervids and
bovids at Kobeh Cave, Iran. Best-fit regression line (Y = -0.015X^0.692; r = 0.945) is
significant (p = 0.0001). Data from Table 6.14.

Table 6.15. NISP and MNE frequencies of skeletal parts
of saiga antelope (Saiga tatarica) from Prolom II Cave,
Ukraine. Data from Enloe et al. (2000)

Skeletal part/portion NISP MNE
Maxilla 319 81
Mandible 477 56
Sacrum 3 3
Scapula 13 12
Humerus 33 22
Radius 112 46
Carpal 156 48
Metacarpal 138 82
Femur 22 16
Patella 4 4
Tibia 33 26
Lateral malleolus 9 9
Astragalus 82 81
Calcaneum 83 55
Naviculo cuboid 36 36
Cuneiform 18 17
Metatarsal 62 37
First phalanx 285 253
Second phalanx 114 109
Third phalanx 81 72
quantitative paleozoology

Figure 6.19. Relationship between NISP and MNE values for saiga antelope at Prolom II
Cave, Ukraine. Best-fit regression line (Y = 0.27X^0.733; r = 0.922) is significant (p = 0.0001).
Data from Table 6.15.

respect to estimating the frequencies of skeletal parts (Figure 6.19). This is a com-
mon pattern. Table 6.16 summarizes the statistical relationships between NISP and
MNE for twenty-nine samples of faunal remains. In all but one case, r > 0.7, and
p < 0.0001; in twenty-five of twenty-nine cases, r > 0.8. The value of MNE as a
quantitative unit, even though it is explicitly designed to tally frequencies of skeletal parts,
seems redundant with NISP; MNE values for an assemblage can often be closely
predicted from the NISP values for that assemblage. In fact, Broughton et al. (2006)
found the correlation between the two variables to be perfect (r = 1.0, p < 0.001) in
a collection of fish remains accumulated and deposited by an owl.
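The Prolom II Cave relationship can be checked directly from Table 6.15. The sketch below assumes that the printed relationship is a least-squares fit of log10(MNE) on log10(NISP), with the first printed coefficient read as the intercept and the second as the slope; that reading is an assumption, since the transformation is not stated in this section. The function name loglog_fit is illustrative only.

```python
import math

# NISP and MNE per skeletal part, transcribed in row order from Table 6.15.
nisp = [319, 477, 3, 13, 33, 112, 156, 138, 22, 4,
        33, 9, 82, 83, 36, 18, 62, 285, 114, 81]
mne = [81, 56, 3, 12, 22, 46, 48, 82, 16, 4,
       26, 9, 81, 55, 36, 17, 37, 253, 109, 72]


def loglog_fit(xs, ys):
    """Least-squares fit of log10(y) on log10(x).

    Returns (intercept, slope, r), where r is Pearson's correlation
    between the log-transformed values.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy / math.sqrt(sxx * syy)


intercept, slope, r = loglog_fit(nisp, mne)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, r = {r:.3f}")
```

Under that assumption the fitted values should fall close to the coefficients printed for Prolom II Cave (0.27, 0.733, and r = 0.922), which is the sense in which MNE can be predicted from NISP for this assemblage.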
Earlier in this chapter it was argued that MNE is at best ordinal scale. This is
easily shown if we consider the relationship of MNE per skeletal part (or por-
tion) to MNI per skeletal part (or portion). Data revealing this relationship are,
like those revealing the relationship between NISP and MNE, also relatively com-
mon. We need only graph one data set to illustrate the relationship. Marshall and
Pilgram's (1991; Pilgram and Marshall 1995) NISP and MNI data for caprine (Ovis
skeletal completeness, skeletal parts, and fragmentation 259

Table 6.16. Relationship between NISP and MNE in twenty-nine assemblages

Assemblage                        Relationship         r      p       Reference
Meier deer                        Y = 0.126X^0.807     0.837  0.0001  This volume
Meier wapiti                      Y = 0.063X^0.766     0.883  0.0001  This volume
Kobeh Cave                        Y = -0.015X^0.692    0.945  0.0001  Marean and Kim (1998)
Prolom II Cave                    Y = 0.27X^0.733      0.922  0.0001  Enloe et al. (2000)
Garnsey (bison)                   Y = 0.192X^0.825     0.942  0.0001  Speth (1983)
Sjovold (bison)                   Y = 0.162X^0.738     0.939  0.0001  Dyck and Morlan (1995)
Twilight Cave BS1 – size II       Y = 0.012X^0.702     0.898  0.0001  Marean (1992)
Twilight Cave RBL2.1 – size I     Y = 0.029X^0.786     0.956  0.0001  Marean (1992)
Twilight Cave RBL2.1 – size II    Y = 0.081X^0.787     0.950  0.0001  Marean (1992)
Twilight Cave RBL2.2 – size I     Y = 0.056X^0.79      0.948  0.0001  Marean (1992)
Twilight Cave RBL2.2 – size II    Y = 0.063X^0.76      0.937  0.0001  Marean (1992)
Twilight Cave RBL2.3 – size I     Y = 0.067X^0.754     0.914  0.0001  Marean (1992)
Twilight Cave RBL2.3 – size II    Y = 0.041X^0.739     0.923  0.0001  Marean (1992)
Twilight Cave DBS – size I        Y = 0.088X^0.772     0.945  0.0001  Marean (1992)
Twilight Cave DBS – size II       Y = 0.108X^0.741     0.932  0.0001  Marean (1992)
Friesenhahn Cave (Homotherium)    Y = 0.036X^0.932     0.957  0.0001  Marean and Ehrhardt (1995)
Friesenhahn Cave (proboscidean)   Y = -0.010X^0.913    0.946  0.0001  Marean and Ehrhardt (1995)
Kua Base Camp-size I              Y = 0.001X^0.756     0.792  0.0001  Bartram and Marean (1999)
Kua Base Camp-size III            Y = 0.095X^0.598     0.725  0.0001  Bartram and Marean (1999)
Kua Scavenged Kill-size III       Y = 0.193X^0.435     0.661  0.0001  Bartram and Marean (1999)
Die Kelders-L. 10, size 1         Y = 0.02X^0.519      0.776  0.0001  Marean et al. (2000)
Die Kelders-L. 10, size 2         Y = -0.084X^0.577    0.789  0.0001  Marean et al. (2000)
Die Kelders-L. 10, size 3         Y = -0.173X^0.592    0.732  0.0001  Marean et al. (2000)
Die Kelders-L. 10, size 4         Y = -0.075X^0.542    0.835  0.0001  Marean et al. (2000)
Nahal Hadera V                    Y = -0.914X^1.256    0.911  0.0001  Munro and Bar-Oz (2005)
Hefzibah                          Y = -0.921X^1.292    0.913  0.0001  Munro and Bar-Oz (2005)
Hayonim Cave-Early Natufian       Y = 0.046X^0.763     0.815  0.0001  Munro and Bar-Oz (2005)
Hayonim Cave-Late Natufian        Y = -0.004X^0.781    0.848  0.0001  Munro and Bar-Oz (2005)
el-Wad Terrace                    Y = -0.358X^1.008    0.823  0.0001  Munro and Bar-Oz (2005)

and Capra) remains from Ngamuriak, a Neolithic pastoral site in Kenya (Table 6.17),
are strongly related (Figure 6.20). Other assemblages from other places and times
display the same relationship between NISP and MNI per skeletal part or portion
as the Ngamuriak collection. Of the twenty-two assemblages listed in Table 6.18, the
relationship between NISP and MNI per skeletal part is rather strong (r > 0.7) in
twenty assemblages, and it is quite strong in fifteen assemblages (r > 0.8). Again, this
should come as no surprise given previous discussions and analyses presented in this
Table 6.17. NISP and MNI frequencies of skeletal parts of
caprines (Ovis and Capra) from Ngamuriak, Kenya. P,
proximal; D, distal. Data from Pilgram and Marshall (1995)

Skeletal part/portion NISP MNI
Tooth rows, lower 288 54
Innominate 133 28
Scapula 94 32
P humerus 30 11
D humerus 91 31
P radius 81 23
D radius 35 16
P metacarpal 50.5 12
D metacarpal 32.5 2
P femur 54 24
D femur 47 12
P tibia 18 6
D tibia 32 12
Calcaneum 65 17
P metatarsal 45.5 12
D metatarsal 29.5 1
First phalanx 83 4
Second phalanx 30 2

Figure 6.20. Relationship between NISP and MNI values for caprine remains from the
Neolithic pastoral site of Ngamuriak, Kenya. Best-fit regression line (Y = -0.469X^0.904;
r = 0.666) is significant (p = 0.002). Data from Table 6.17.

Table 6.18. Relationship between NISP and MNI per skeletal part or portion in twenty-two
assemblages

Assemblage                        Relationship         r      p       Reference
Ngamuriak                         Y = -0.469X^0.904    0.666  0.0026  Pilgram and Marshall (1995)
Gatecliff Shelter                 Y = -0.083X^0.762    0.627  0.0001  Thomas and Mayer (1983)
Boomplaas-Size I                  Y = -0.018X^0.587    0.769  0.0001  Klein and Cruz-Uribe (1984)
Boomplaas-Size IV                 Y = -0.051X^0.53     0.888  0.0001  Klein and Cruz-Uribe (1984)
El Juyo Red Deer-L. 4             Y = -0.034X^0.492    0.768  0.0001  Klein and Cruz-Uribe (1984)
El Juyo Red Deer-L. 6             Y = -0.062X^0.60     0.701  0.0001  Klein and Cruz-Uribe (1984)
Equus Cave-Size IV                Y = -0.083X^0.705    0.888  0.0001  Klein and Cruz-Uribe (1984)
Elandsfontein–Size I              Y = -0.097X^0.70     0.874  0.0001  Klein and Cruz-Uribe (1991)
Elandsfontein–Size II             Y = -0.168X^0.799    0.876  0.0001  Klein and Cruz-Uribe (1991)
Elandsfontein–Size III            Y = -0.395X^0.977    0.905  0.0001  Klein and Cruz-Uribe (1991)
Elandsfontein–Size III            Y = -0.242X^0.882    0.902  0.0001  Klein and Cruz-Uribe (1991)
Elandsfontein–Size V              Y = -0.189X^0.84     0.855  0.0001  Klein and Cruz-Uribe (1991)
Klasies River Mouth–Size I        Y = -0.017X^0.728    0.860  0.0001  Klein (1989)
Klasies River Mouth–Size II       Y = -0.073X^0.749    0.827  0.0001  Klein (1989)
Klasies River Mouth–Size III      Y = -0.065X^0.633    0.823  0.0001  Klein (1989)
Klasies River Mouth–Size IV       Y = -0.014X^0.684    0.748  0.0001  Klein (1989)
Klasies River Mouth–Size V        Y = -0.017X^0.582    0.779  0.0001  Klein (1989)
El Castillo Cave–Mag & Sol        Y = -0.102X^0.877    0.958  0.0001  Klein and Cruz-Uribe (1994)
39FA82                            Y = -0.018X^0.85     0.988  0.0001  White (1952)
Bull Pasture bison                Y = -0.062X^0.842    0.925  0.0001  White (1955)
Bull Pasture wapiti               Y = -0.074X^0.884    0.966  0.0001  White (1955)
Buffalo Pasture                   Y = -0.152X^0.901    0.883  0.0001  White (1956)


This chapter has concerned the MNE quantitative unit (and various units derived
from MNE) and properties it is thought to measure (e.g., skeletal-part abundances,
skeletal completeness, fragmentation). As Ringrose (1993:129) pointed out, MNE is a
quantitative unit “specifically designed for the study of skeletal-part representation,
rather than taxonomic abundance.” This does not mean that MNE and MNI (or
NISP) are not mechanically or statistically related. MNI values are by definition
(Table 2.4) based on the maximum MNE. Thus, White's (Table 6.5) summed left
and right MNE values are strongly correlated with the MNI values for each of those
fourteen skeletal elements (Figure 6.21). This is because MNI per skeletal element
is merely the greater of the tally of left elements or the tally of right elements. MNI

Figure 6.21. Relationship between MNE (lefts + rights) and MNI per skeletal part. Diagonal
shown for reference. Data from Table 6.5.

is a tally of redundant skeletal parts or portions, traditionally based on the greatest
MNE in a collection.
MNE seems, on the surface, to be a valuable quantitative unit. It may in fact be
valuable if it is clear that, say, femora are much more intensively and extensively
fragmented than are humeri. The quick way to determine this is to calculate the
relationship between NISP per skeletal part and the MNE per skeletal part. If the
two values are strongly correlated, there is little reason statistically to use MNE in
further analyses, such as determining if femora are more abundant than humeri;
NISP will provide the same ordinal scale information as MNE. MNE is a valuable
unit for measuring the intensity of fragmentation, defined as the NISP_i:MNE_i ratio,
where i is a particular skeletal part. Based on analyses and arguments presented in this
chapter and elsewhere (Grayson and Frey 2004), MNE is not useful for measuring
skeletal-part frequencies. This is so because it is derived (definition dependent), it is
influenced by sample size (NISP), and it is influenced by aggregation.
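The fragmentation-intensity ratio described above can be illustrated with a few rows of Table 6.15. This is only a sketch: the choice of skeletal parts is arbitrary, and the function name fragmentation_intensity is an illustrative label rather than an established one.

```python
# (NISP, MNE) for selected skeletal parts, transcribed from Table 6.15.
counts = {
    "Mandible": (477, 56),
    "Maxilla": (319, 81),
    "Humerus": (33, 22),
    "Femur": (22, 16),
    "Astragalus": (82, 81),
}


def fragmentation_intensity(nisp_i, mne_i):
    """NISP_i:MNE_i ratio for skeletal part i.

    A value near 1 means roughly one specimen per element; larger
    values mean more specimens (fragments) per element.
    """
    return nisp_i / mne_i


ratios = {part: fragmentation_intensity(n, m) for part, (n, m) in counts.items()}

# List parts from most to least intensively fragmented.
for part, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{part:<12}{ratio:6.2f}")
```

In this subset the mandible has by far the highest ratio and the astragalus sits near 1, whereas the humerus and femur differ only slightly, so the ratio gives little reason to treat either of those two elements as much more fragmented than the other.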
MNE has undergone a history similar to that of MNI. Both units were used to measure
the value of a variable, then various potential problems with them were identified
and efforts were made to resolve those problems, for example, tallying units more
carefully and more consistently taking into account numerous factors (age/sex/size
differences; fragmentation differences and anatomical overlap). After various sorts
of potential additional steps to tallying specimens into MNI or MNE units were
identified and implemented, it was pointed out that perhaps the quantitative unit is
not salvageable despite various safeguards. The quantitative unit is not salvageable
because it is in fact a derived measure and it is at best ordinal scale; it is redundant
with NISP, a fundamental measure. As with MNI, it seems we have reached the point
with MNE where it may no longer be worth using it to the same degree that it once
was, particularly with respect to measuring skeletal-part frequencies.
There are other quantitative units similar to MNE. These include the minimum
number of butchering units (Lyman 1979; Schulz and Gust 1983), and the minimum
number of analytically specified anatomical regions (Stiner 1991, 2002). It is beyond
the scope of this discussion to explore the properties of these units, but it is logical to
suspect that they, too, will often be strongly correlated with NISP or sample size, and
heavily influenced by aggregation and how they are defined. This suspicion is based
on the fact that both the minimum number of butchering units and the minimum
number of anatomical regions are determined in the same manner as MNE and
MNI. The only difference is that the minimum number of butchering units and the
minimum number of anatomical regions are at skeletal scales of inclusiveness between
MNE and MNI as typically defined. The most important thing to remember is that
MNE and similar units are often significantly influenced by sample size, aggregation,
and definition, just as is MNI. This simple fact suggests that NISP is to be preferred
over MNE and similar units, especially when MNE provides abundance information
that is redundant with NISP.
Tallying for Taphonomy: Weathering,
Burning, Corrosion, and Butchering

Taphonomy is a term coined by Russian paleontologist I. A. Efremov (1940) from
the Greek words taphos (burial) and nomos (law). Efremov meant for taphonomy to
specify the transition, in all details, of organics from the biosphere to the lithosphere.
In the context of this book (recall Figure 2.1), taphonomy concerns the agents and
process(es) that influence an animal carcass from the moment of that animal's death
until its remains (if any survive the vicissitudes of time) are recovered by the
paleozoologist, and also the kind and magnitude of those influences. There are a plethora
of taphonomic agents and processes that variously disarticulate, disperse, alter, and
destroy carcass tissues, including bones and teeth (Lyman 1994c).
In this chapter, techniques for tallying what are sometimes referred to as tapho-
nomic signatures, features, or attributes evident on faunal remains are introduced.
Identifying the taphonomic agents and processes that influenced an assemblage of
faunal remains assists interpretation of the remains. (If the agent is biological, then
the taphonomic feature is a trace fossil [Gautier 1993; Kowalewski 2002].) Do, for
example, those remains reflect what human hunters ate or do they represent a fluvially
winnowed set of skeletons of animals that died during a seasonal crossing of
a river at flood stage? Determination of the taphonomic history of a collection of
faunal remains may reveal aspects of paleoecology not otherwise evident among the
collection of remains, such as evidence of carnivore gnawing on ungulate bones when
no carnivore remains are recovered.
A taphonomic signature is a modification feature evident on a skeletal part that is
known (or believed) to have been created by only one process or agent (Blumenschine
et al. 1996; Fisher 1995; Gifford-Gonzalez 1991; Marean 1995). It is a signature because
it is unique to that agent or process. A taphonomic feature need not be a signature;
it is an artifact or epiphenomenon of an agent or process that modified a skeletal
specimen's location, anatomical completeness, or appearance. Given the model of
an unmodified skeletal part as it would appear in a normal organism walking, flying,
tallying for taphonomy 265

or swimming around the landscape, any perimortem or postmortem modification
to that skeletal part that was not created by physiological processes of the organism
(e.g., a healed fracture [antemortem]) is a taphonomic feature. Instances of the
occurrence of that modification may be recorded during study of the remains because
by definition such an attribute is not a normal feature of a bone or tooth. A modification
feature need not have a specifically identifiable cause or creation agent, and
in fact many features do not, though that number is decreasing as our knowledge
of causal agents increases through actualistic research (e.g., Domínguez-Rodrigo
and Barba 2006; Kowalewski 2002). A taphonomic feature is created perimortem (at
death) or postmortem (after death). Tallying up, say, the frequency of specimens with
gnawing marks, or the frequency of gnawing marks, or both comprises tallying for
taphonomy.
Gnawing marks created by hungry carnivores, butchering marks created by hungry
hominids, burning damage created by fuel-hungry flames, and various other
such taphonomic features can be tallied in various ways to decipher the taphonomic
history of a collection of animal remains. Intuitively, for instance, given two collec-
tions of bones that are otherwise quite similar (in terms of taxonomic abundances,
however measured, and in terms of frequencies of skeletal parts), the one with more
bones displaying carnivore gnawing damage is likely the one that underwent relatively
more carnivore-gnawing related attrition (consumption) of bone tissue. Tallying such
attributes may seem straightforward, but even if tallying is sometimes easy to do, it
is not always easy to understand or interpret the tallies. What a tally signifies may
well be obscure because a tally of taphonomic attributes (measured variable) may
have an unknown relationship to a particular taphonomic (target variable) agent
or process. How, for example, does the frequency of gnawed bones (measured vari-
able A) or the frequency of gnawing marks (measured variable B) relate to gnawing
intensity (target variable)? Many attributes are thus not signatures but are tallied in
hope that the quantitative data will reveal aspects of the taphonomic history of the
This chapter begins with some rather easily tallied taphonomic attributes that
many taphonomists and paleozoologists believe have well-understood relationships
to taphonomic agents and processes. The discussion progresses to complex attributes
for which little consensus exists as to how they should be tallied and/or what a tally
might mean with respect to a taphonomic agent or process. The goal of this chapter
is not to solve particular substantive problems, but rather to describe quantitative
units, illustrate how they are tallied, and exemplify how they might be analyzed.
And given the topic of this chapter, another quantitative unit must first be


There are two units not often mentioned in the quantitative paleozoology literature
that need to be identified. One unit previously mentioned in this book is the number
of specimens (NSP). NSP is the number of all specimens in an assemblage or collection
(however defined), including those that are identifiable to taxon and those that
are not identifiable. NSP is a fundamental measure just like NISP. NSP has been used
by name by several zooarchaeologists (e.g., Grayson 1991a; Stiner 2005). Another
unit, not previously identified in this book, used by fewer individuals is what Stiner
(2005:235) uniquely refers to as the number of unidentified specimens, or NUSP. For
any collection of faunal remains, NSP = NISP + NUSP, and NISP = NSP - NUSP.
Why should NSP and NUSP be of concern? First, and less importantly, they need
to be mentioned because sometimes paleozoologists will refer to the ratio NISP/NSP.
The implication of the ratio is seldom stated explicitly, but it seems to be thought
that the higher the ratio (the greater the proportional value of NISP), the more
specimens were identified because they were not so badly preserved (corroded,
fragmented) as to be unidentifiable. The NISP/NSP ratio is thus thought to reflect general
aspects of the taphonomic history of a collection (e.g., fragmentation and destruction
extent and intensity). Perhaps because the relationship of the NISP/NSP ratio
to preservational condition has never been empirically or critically examined, the
NISP/NSP ratio is seldom used analytically. Or, perhaps the ratio is seldom analyzed
because it could be a function of which skeletal parts are represented as some parts
are more easily identified than others. Whatever the case, if mentioned at all, the
ratio is often mentioned in a descriptive role. After all, it is simple to calculate and it
is based on two directly measured variables – NISP and NSP – that are readily determined.
(NSP and NUSP are influenced by fragmentation, though this is not generally
The second reason to mention NSP and NUSP is important and concerns the
fact that many of the features tallied for taphonomic purposes can be tallied for the
NISP of a collection, or for the NSP (= NISP + NUSP). Here taphonomic features are
discussed as if they are only tallied using NISP because that is the traditional manner in
which they are tallied. Distinguishing NISP and NUSP, and tallying the taphonomic
features using both, may be worth considering if, and this is important, there is
reason to believe that the taphonomic process or agent might be reflected differently
across identified specimens than it is across unidentified specimens. If there is no
reason to believe that this is the case, then if the sample of NISP is sufficiently large
(and we can ascertain that by sampling to redundancy [Chapter 4]), then tallying
taphonomic features across NISP will likely be sufficient to measure taphonomic

Table 7.1. Weathering stages as defined by Behrensmeyer (1978)

Stage  Definition
0      Greasy, no cracking or flaking, may have soft tissue attached.
1      Longitudinal cracking; articular surfaces with mosaic cracking; split lines
       beginning to form.
2      Flaking of outer surface (exfoliation); cracks present; crack edges are angular.
3      Compact bone has rough, fibrous texture; weathering penetrates 1–1.5 mm;
       cracked edges are rounded.
4      Coarsely fibrous and rough surface; loose splinters present; weathering
       penetrates to inner cavities; cracks are open.
5      Bone tissue very fragile and falling apart; large splinters present.

variables and the degree or extent of the influence of the taphonomic processes and
agents of concern.


In a classic paper, Behrensmeyer (1978) specified six stages through which a (mammal)
bone would pass during subaerial weathering (Table 7.1). She defined weathering
as “the process by which the original microscopic organic and inorganic
components of bone are separated from each other and destroyed by physical and
chemical agents operating on the bone in situ, either on the [Earth's] surface or within
[sediments]” (Behrensmeyer 1978:153). Weathering involves the natural decomposition
and destruction of bone tissue, and Behrensmeyer recorded subaerial weathering
only – weathering that occurs “on the [ground] surface.” To quantify weathering
damage, Behrensmeyer (1978:152) suggested that the analyst tally the number of bone
specimens that display each weathering stage. The weathering stage that a specimen
displays is, in turn, recorded as the maximum weathering stage evident on an area
comprising at least 1 cm² of the surface of a specimen.
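The recording rule just described (score each specimen by its maximum stage, then count specimens per stage) can be sketched as follows. The specimen identifiers and the stages observed on each are hypothetical, invented only to show the mechanics; each listed stage is assumed to have been observed on a patch of at least 1 cm².

```python
from collections import Counter

# Hypothetical observations: weathering stages (0-5) seen on qualifying
# surface patches of each specimen. IDs and values are invented.
observations = {
    "spec-001": [0, 1, 1],
    "spec-002": [2, 3],
    "spec-003": [0],
    "spec-004": [1, 3, 2],
    "spec-005": [5, 4],
}


def tally_weathering(obs):
    """Count specimens per weathering stage.

    Each specimen contributes once, at the maximum stage observed on it.
    Stages with no specimens are reported as zero.
    """
    tally = Counter(max(stages) for stages in obs.values())
    return {stage: tally.get(stage, 0) for stage in range(6)}


stage_counts = tally_weathering(observations)
print(stage_counts)
```

Because every specimen is counted exactly once, the stage counts sum to the number of specimens examined, which is what makes NISP the counting unit here.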
The maximum weathering displayed is used because the target variable of interest
concerns not how weathered (or unweathered) a specimen is, but rather the duration
of “surface exposure of a bone prior to burial and the time period over which bones
accumulated” (Behrensmeyer 1978:161). This means that the maximum weathering
stage evident is recorded rather than minimum or average weathering evident on a
specimen for two reasons. First, several variables mediate (slow) the rate of weathering
(Lyman and Fox 1989), thereby potentially weakening any statistical relationship
between weathering stage (the dependent variable) and the variable of analytical

Table 7.2. Weathering stage data for two collections of
mammal remains from Olduvai Gorge. Frequencies are
NISP (% of total NISP). Data from Potts (1986)

Stage FLK “Zinj” FLKNN L/2
0 771 (76) 105 (46)
1 147 (14) 59 (26)
2 63 (6) 24 (10)
3 36 (4) 39 (17)
4 0 (0) 2 (1)
5 1 (0) 1 (0)

interest (e.g., duration of exposure). The second reason that maximum weathering
is used is that there is a strong statistical relationship between date of death of the
animal contributing the maximally weathered bone and the maximum weathering
stage displayed by one or more bones of the carcass (Behrensmeyer 1978). The critical
interpretive issue, then, requires understanding the relationship between maximum
weathering stage displayed and the target variable of interest whether it be duration
of exposure, date of animal death, or something else.
Quantifying bone weathering is relatively straightforward. Count up how many
specimens in a collection display each of the six weathering stages. Then, present the
tallies in a table as absolute counts of specimens per weathering stage, as proportions
or percentages of specimens per weathering stage, or both. Data can also be presented
graphically. An example is provided by Potts's (1986) data for assemblages of mammal
remains from Plio-Pleistocene archaeological sites in Olduvai Gorge, Tanzania. Data
for two assemblages are summarized in Table 7.2, and percentage frequency data for
the two assemblages are graphed in Figure 7.1 in what Lyman and Fox (1989:300) term
a weathering profile, defined as “the percentage frequencies of bone specimens in an
assemblage displaying each weathering stage.” Percentage frequency data eliminate
the effects of variation in sample size, thereby permitting differences between the
two assemblages plotted in the graph in Figure 7.1 to be interpreted in terms of
differences in weathering rather than in terms of difference in sample size. The fact
that one assemblage is nearly four and a half times larger than the other (Table 7.2)
is not apparent in Figure 7.1.
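A weathering profile of this kind is simple to compute from the counts in Table 7.2. The following is a minimal sketch; the function name weathering_profile is illustrative.

```python
# NISP per weathering stage (stages 0-5), transcribed from Table 7.2.
flk_zinj = [771, 147, 63, 36, 0, 1]
flknn_l2 = [105, 59, 24, 39, 2, 1]


def weathering_profile(counts):
    """Percentage of specimens in each weathering stage."""
    total = sum(counts)
    return [100 * c / total for c in counts]


zinj_profile = weathering_profile(flk_zinj)
l2_profile = weathering_profile(flknn_l2)

print("Stage   FLK Zinj   FLKNN L/2")
for stage, (a, b) in enumerate(zip(zinj_profile, l2_profile)):
    print(f"{stage:<7} {a:7.1f}%   {b:7.1f}%")

# Sample sizes differ greatly, but the profiles are on a common scale.
size_ratio = sum(flk_zinj) / sum(flknn_l2)
print(f"FLK Zinj is about {size_ratio:.1f} times larger than FLKNN L/2")
```

Rounded to whole numbers, the percentages agree with those given in parentheses in Table 7.2, and each profile sums to 100 regardless of how many specimens the assemblage contains.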
Note that thus far the weathering data have been presented in tabular form
(Table 7.2) and in graphic form (Figure 7.1). How might those data be analyzed
further? χ² analysis indicates that specimens are not equally distributed within the
weathering stages across the two assemblages (χ² = 109.74, p < 0.0001). Analysis

Table 7.3. Expected frequencies (EXP) of specimens per weathering stage
(WS) in two collections (Zinj; L/2), adjusted residuals (AR), and
probability values (p) for each. Based on data in Table 7.2

WS  Zinj EXP  L/2 EXP  Zinj AR  L/2 AR  Zinj p  L/2 p
0   714.6     161.4      9.09   -9.09   <0.01   <0.01
1   168.0      38.0     -4.11    4.11   <0.01   <0.01
2    71.0      16.0     -2.30    2.30   <0.05   <0.05
3    61.2      13.8     -7.77    7.77   <0.01   <0.01
4     1.6       0.4     -2.95    2.95   <0.01   <0.01
5     1.6       0.4     -1.10    1.10   >0.1    >0.1

of adjusted residuals indicates that relative to the FLKNN L/2 assemblage, the FLK
“Zinj” assemblage contains more specimens displaying weathering stage 0 and fewer
displaying stages 1, 2, 3, and 4 than expected given random chance (Table 7.3). These
results identify the statistical significance of Figure 7.1, but there is other variation
between the two weathering profiles that the χ² analysis does not capture. What
other kinds of analysis might be done?
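The χ² statistic and the expected frequencies can be recomputed from Table 7.2 with a few lines of code. The sketch below also computes adjusted residuals using the common standardized form (residual divided by the square root of the expected value times one minus the row proportion times one minus the column proportion); that formula is an assumption, since this section does not say how the residuals in Table 7.3 were calculated, so the residuals it yields need not match the printed values exactly.

```python
import math

# Rows are weathering stages 0-5; columns are FLK "Zinj" and FLKNN L/2
# (NISP from Table 7.2).
observed = [(771, 105), (147, 59), (63, 24), (36, 39), (0, 2), (1, 1)]

col_totals = [sum(row[j] for row in observed) for j in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
expected = []   # expected[i][j]: expected count for stage i, assemblage j
adj_resid = []  # adj_resid[i][j]: adjusted residual for the same cell

for row in observed:
    row_total = sum(row)
    exp_row, ar_row = [], []
    for j, obs in enumerate(row):
        exp = row_total * col_totals[j] / grand_total
        chi2 += (obs - exp) ** 2 / exp
        # Assumed variance term for the adjusted (standardized) residual.
        var = exp * (1 - row_total / grand_total) * (1 - col_totals[j] / grand_total)
        exp_row.append(exp)
        ar_row.append((obs - exp) / math.sqrt(var))
    expected.append(exp_row)
    adj_resid.append(ar_row)

print(f"chi-square = {chi2:.2f}")
for stage, (e, ar) in enumerate(zip(expected, adj_resid)):
    print(f"stage {stage}: EXP {e[0]:6.1f} {e[1]:6.1f}   AR {ar[0]:6.2f} {ar[1]:6.2f}")
```

With only two assemblages, the two residuals in each row are equal in magnitude and opposite in sign, so a positive value for one assemblage necessarily implies a deficit in the other.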

Figure 7.1. Weathering profiles for two collections of ungulate remains from Olduvai
Gorge. Data from Table 7.2.

A paleozoologist could also determine the richness of weathering stages in each,
and the evenness and heterogeneity of each as well. These values for FLK “Zinj” are 5
(richness), 0.489 (evenness), and 0.787 (heterogeneity); for FLKNN L/2 the values are
6, 0.730 and 1.309. The richness values do not tell us much by themselves. The evenness
value is greater for FLKNN L/2, indicating that it has a more even distribution of
specimens across the weathering stages than the FLK “Zinj” collection. Finally, the
heterogeneity index values conform with the combined richness and evenness values
for each and indicate that the FLKNN L/2 assemblage is more heterogeneous – richer
and more even – than the FLK “Zinj” assemblage. All of these values, and particularly
the evenness and heterogeneity index values, suggest that the FLKNN L/2 assemblage
is more weathered than the FLK “Zinj” assemblage, although the only way we know
this rather than it being the other way around is in light of Figure 7.1.
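These three values can be recomputed from Table 7.2. The sketch assumes that heterogeneity is the Shannon index calculated with natural logarithms and that evenness is that index divided by the natural logarithm of richness; the definitions are not restated in this section, so both are assumptions, although under them the printed values are reproduced closely.

```python
import math

# NISP per weathering stage (stages 0-5), transcribed from Table 7.2.
flk_zinj = [771, 147, 63, 36, 0, 1]
flknn_l2 = [105, 59, 24, 39, 2, 1]


def diversity(counts):
    """Return (richness, evenness, heterogeneity) for stage counts.

    Richness is the number of stages with at least one specimen.
    Heterogeneity is assumed to be Shannon's H (natural log);
    evenness is assumed to be H / ln(richness).
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    richness = len(props)
    heterogeneity = -sum(p * math.log(p) for p in props)
    evenness = heterogeneity / math.log(richness)
    return richness, evenness, heterogeneity


for name, counts in (("FLK Zinj", flk_zinj), ("FLKNN L/2", flknn_l2)):
    s, e, h = diversity(counts)
    print(f"{name:<10} richness = {s}, evenness = {e:.3f}, heterogeneity = {h:.3f}")
```

Neither index records which stages hold the specimens, only how evenly the specimens are spread, which is why the direction of the difference has to be read from the profile itself.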
Tallying so far has been straightforward. But if one were to interpret the data
in Tables 7.2 and 7.3 and the graph in Figure 7.1 as reflecting differences in bone
accumulation duration, as Potts (1986) did and Behrensmeyer (1978) hoped to do –
the basic presumption being that an assemblage with a more left-skewed weathering
profile (tail to the left, maximum frequency to the right) took longer to accumulate
than an assemblage with a right-skewed profile (tail to the right, maximum frequency
to the left) – a number of assumptions would have to be made. Behrensmeyer (1978)
perceived a positive relationship between how long a carcass had been lying on the
landscape (years since death) and the greatest weathering stage displayed by any one
of the bones of the carcass. The correlation between the two variables is indeed strong
and significant, as implied by Figure 7.2 (r = 0.872, p < 0.0001). Based on actualistic
research by numerous others, Lyman and Fox (1989) noted that there were a number
of variables that could reduce the correlation coefficient to considerably less than 1.0.
Few individuals subsequently presented, let alone interpreted, bone weathering data
as Potts (1986) had done. Rather, weathering data came to be used to gain insight
into other aspects of the taphonomic history of a bone assemblage.
Given Behrensmeyer's (1978:153) observation that “bones are usually weathered
more on the upper (exposed) than the lower (ground contact) surfaces,” analysts
now examine which surface is more weathered and which is less weathered. Skyward
or upper surfaces should be more weathered than groundward or lower surfaces
because the upward surface is more directly exposed to sunlight, precipitation, and
other climatically related weathering agents. If the reverse is observed, if the ground-
ward surface is more weathered than the upper surface, then it is likely there has
been some postdepositional and perhaps postburial disturbance. If there are many
bones, then one could construct a weathering profile like that in Figure 7.1, but with
a distinction between the weathering stage displayed by the skyward surfaces and the

Figure 7.2. Relationship between years since death and the maximum weathering stage
displayed by bones of a carcass (r = 0.872, p < 0.0001). Plotted numbers indicate frequency of
carcasses displaying a particular weathering stage and years since death. Data from
Behrensmeyer (1978).

weathering stage displayed by the groundward surfaces. An example of such a graph
using fictional data is shown in Figure 7.3. What is shown is what is expected in an
assemblage that experienced minimal postdepositional disturbance – the groundward
surfaces are generally less weathered than the skyward surfaces.
No taphonomist has used a quantitative unit for tallying weathering data other
than NISP. Some analysts, beginning with Behrensmeyer (1978), have suggested
that perhaps frequencies of weathered long bones should be tallied separately from
frequencies of small, compact bones, such as carpals, tarsals, and phalanges, and
perhaps as well scapulae, innominates, skulls, and vertebrae should be tallied as a
group separate from long bones (one group) and small bones (another group). Do
femora weather at the same rate as phalanges? Actualistic data suggest that they do
not (summarized in Lyman and Fox 1989; see also Lyman 1994c). Thus, one might
tally the number of each skeletal element displaying each weathering stage. If the
weathering profiles of, say, scapulae and humeri are not significantly
different, then lump them together for a skeletal composite weathering profile. Multiple
such comparisons between different categories of bone size and shape may

Figure 7.3. Weathering profiles based on fictional data for a collection of bones with
skyward surfaces representing one profile and groundward surfaces representing another.

indicate that, say, small, more or less spherically shaped dense bones such as carpals,
tarsals, and phalanges, and larger, plate-shaped less dense bones, such as scapulae and
innominates, may reveal differences in weathering profiles (for a discussion of how
to measure bone shape, see Darwent and Lyman [2002]). Similarly, one might sepa-
rately tally weathering stages evident on bones of large ungulates and those evident
on bones of small ungulates, or equids and bovids, or the like to evaluate similarities
or differences between taxa in terms of weathering.
Quantification of bone weathering data involves counting bones or counting kinds
of bones (skeletal element, or bone shape) based on the maximum weathering stage
displayed by each. The basic counting unit is NISP, but different kinds of NISP
(shape, size, taxon considered or not) may be distinguished. The entire specimen
(regardless of kind) is subject to weathering yet the longest exposed (if you will)
portion of the specimen will display the maximum degree of weathering. Recall why
the most advanced weathering stage displayed by a specimen is recorded rather
than a less advanced stage. The conceptual clarity of the relationship between the
target variable and the measured variable is what makes the quantification of bone
weathering attributes straightforward relative to some other kinds of taphonomic
attributes. The entire surface of each discrete specimen can potentially weather at the
same rate – all surfaces are taphonomically interdependent at a general level because
they are all connected – and although this does not seem to happen in practice
(groundward surfaces weather more slowly than skyward surfaces), interdependence of all
surface area of a specimen renders NISP the correct quantitative unit if one wishes
to know which of two assemblages of bones is the more weathered.
If post-depositional disturbance is of interest, then knowing the stage of weather-
ing displayed by skyward and by groundward surfaces is important. If intraskeletal
variation in weathering is of interest, then tally by distinct skeletal parts. If taxonomic
variation in weathering is of interest, then tally skeletal parts by taxon. The target
variable and how to tally should be specified by the research question. In most cases,
each specimen will be tallied based on the maximum weathering it displays. Weath-
ering stage data are generally tallied using NISP. One might tally weathering by NSP
to determine if a higher proportion of NUSP displays more advanced weathering
than NISP; if so, that would suggest long-term exposure on the ground surface and
a low NISP:NSP ratio that resulted from subaerial weathering. The interdependence
of the entire surface of a specimen dictates the tallying protocol, and it also attends
the tallying of other sorts of taphonomic modifications to bones.
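
The tallying protocol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the specimen records, category names, and the function name `weathering_profile` are hypothetical, and each specimen is assigned to the maximum weathering stage it displays before percentages of NISP are computed per category.

```python
from collections import Counter

# Hypothetical records: (specimen id, skeletal category, weathering stages observed).
specimens = [
    ("s1", "long bone", [1, 2]),
    ("s2", "long bone", [3]),
    ("s3", "compact bone", [0, 1]),
    ("s4", "compact bone", [1]),
    ("s5", "long bone", [2, 3]),
]

def weathering_profile(records):
    """Return {category: {stage: percent of NISP}} using each specimen's maximum stage."""
    counts = {}
    for _specimen_id, category, stages in records:
        # Each specimen is tallied once, at the most advanced stage it displays.
        counts.setdefault(category, Counter())[max(stages)] += 1
    profile = {}
    for category, tally in counts.items():
        nisp = sum(tally.values())
        profile[category] = {
            stage: 100.0 * n / nisp for stage, n in sorted(tally.items())
        }
    return profile

profile = weathering_profile(specimens)
```

The same function could be run on NUSP records, or on records keyed by taxon rather than by bone shape, depending on the research question.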


Some faunal remains have passed through a digestive tract and as a result have
been chemically corroded (e.g., Andrews 1990). Corrosion features on bones include
solution pits or ovoid depressions, fissures that penetrate through cortical bone, and
feathered fracture surfaces (Darwent and Lyman 2002; Klippel et al. 1987; Lyman
1994c). Quantification of such observations typically involves tallying the NISP that
display digestive (or other) corrosion and then calculating the percentage of the
total NISP that display corrosion (e.g., Fernandez-Jalvo and Andrews 1992; Klippel
et al. 1987; Weissbrod et al. 2005). (Seldom is the percentage of NSP that displays
digestive corrosion reported; it might be taphonomically revealing to compare the
percentage of NISP that has been digested with the percentage of NUSP that displays
digestive damage.) Because the entire specimen is ingested, all areas of the surface
of the specimen are interdependent with respect to the action of the taphonomic
process of digestive corrosion. NISP (or NSP) thus is the logical quantitative unit for
tallying digestive corrosion.
Given that stages of the degree of corrosion can be defined (e.g., Darwent
and Lyman 2002; Fernandez-Jalvo and Andrews 1992; Matthews 2002), one might
construct a digestive corrosion profile analogous to a weathering profile (Figure 7.1).
The frequency data used to construct the profile would allow graphical and statistical
comparisons between assemblages of bones that may have undergone different
levels of digestive corrosion like those applied to weathering data. We do not yet
know enough about digestion and how, say, hair as in owl pellets might buffer some
portions of a bone specimen's surface area from corrosion. Once we know this, some
research questions may demand that the range of the degree of corrosion damage to
a specimen be recorded.
What is often referred to as root etching is another kind of corrosive damage that is
sometimes reported. Again, taphonomic interdependence of all areas of the surface of
the specimen makes NISP the quantitative unit of choice. But, upward surfaces may
display root etching whereas downward surfaces of specimens do not (this seems to
be the case). If so, and as yet we seem to know too little about root etching to be sure,
then this sort of corrosive damage could be quantified by number of skyward surfaces
and number of groundward surfaces. Until we know more about the taphonomic
process itself, quantitative units have ambiguous significance; we do not presently
know what corrosion damage quantified using NISP means. Exactly the same can be
said for frictional abrasion such as occurs during fluvial transport and other sorts of
damage that have the potential to influence the entire surface of a specimen. Until
we know better, NISP is the quantitative unit of choice for measuring corrosion and
abrasion. But there may be a better quantitative unit for such features. One might measure the
amount (proportion) of surface area that displays corrosion, erosion, and abrasion
damage, although this would require a labor-intensive analysis. Whether one uses
NISP or the amount of surface area as the quantitative unit will depend on the target
variable. Often, what that target variable might be is unclear in the literature. If the
desire is to compare two assemblages, then it may make no difference whether percent
of surface area or percent of NISP that displays a kind of damage is determined. Until
we have better knowledge of the relationship between particular measured variables
and particular target variables, using NISP will suffice. The same argument applies
with equal force to quantifying the taphonomic signature of the effect of ¬re on
faunal remains.


Quantification of burned bone has, like other taphonomic features, typically involved
determination of the percentage of the total NISP that comprises burned specimens

figure 7.4. Frequency distribution of seven classes of burned bones in two kinds of
archaeological contexts. Original data from Cain (2005).

(e.g., Cain 2005; Edgar and Sciulli 2006; Grayson 1988; Pozorski 1979). One could also
tally the percentage of NSP that has been burned. Because burning is a process that
results in continuous modification of bone tissue – burning is a result of excessive
heat – different stages of burning can often be distinguished. One of the simplest
schemes to operationalize was specified by Brain (1981:55), who designated three
stages: (1) unburned bone; (2) “carbonized” bone that is black because as the collagen
is burned a specimen becomes charcoal or carbonized; and (3) “calcined” bone that
is white because continued heating has oxidized the carbon. There is the potential
for even finer distinctions of how intensively burned individual specimens are (e.g.,
Cain 2005; Johnson 1989; Shipman et al. 1984).
Tallying specimens by the maximum burning stage each displays, one can construct
a “burning profile” much like a weathering profile. Figure 7.4 shows the percentage
frequencies of bone specimens <2 cm long from the Middle Stone Age site of Sibudu
Cave, South Africa. Cain (2005) presented these data for six individual hearths and
two individual “ash complexes.” His data are summed by functional context (hearth
vs. ash complex) in Figure 7.4. That figure suggests burning is similar across the two
kinds of contexts. Cain (2005) does not present his data in a manner that allows χ²
analysis, analysis of adjusted residuals, or calculation of the evenness of representation
of burning stages such as was done with the weathering data in Figure 7.1. These
sorts of quantitative analyses would be logical next steps.
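
As a rough illustration of what those next steps might look like when raw counts are available, the sketch below computes a χ² statistic, adjusted residuals, and an evenness index for two burning profiles. The counts are invented for the example and are not Cain's data. The function names are hypothetical, and Shannon evenness (heterogeneity divided by the natural log of the number of occupied categories) is assumed as the evenness measure.

```python
import math

def chi_square_with_residuals(table):
    """Return (chi-square statistic, degrees of freedom, adjusted residuals)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
            # Adjusted residual: deviation scaled by its estimated standard error.
            scale = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            residual_row.append((observed - expected) / scale)
        residuals.append(residual_row)
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof, residuals

def shannon_evenness(counts):
    """Shannon heterogeneity divided by ln(number of occupied categories)."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    if len(proportions) < 2:
        return 0.0
    heterogeneity = -sum(p * math.log(p) for p in proportions)
    return heterogeneity / math.log(len(proportions))

# Invented counts per burning stage: unburned, carbonized, calcined.
hearths = [30, 50, 20]
ash_complexes = [45, 40, 15]
statistic, dof, residuals = chi_square_with_residuals([hearths, ash_complexes])
evenness_by_context = [shannon_evenness(hearths), shannon_evenness(ash_complexes)]
```

The statistic would then be compared with the χ² distribution for the returned degrees of freedom, and the adjusted residuals indicate which burning stages contribute most to any difference between contexts.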
It may be informative if bones are tallied in categories distinguished by burning
stage and also by taxon and skeletal element represented (or NISP and NUSP, given
that more specimens in the more advanced stages of burning in NUSP than in NISP
would suggest burning-related fragmentation reduced the proportion of identifiable
specimens). Thus, say, one could have burning profiles the tally categories of which
are identical to those described for weathering. And, as with weathering, record the
maximum burning stage represented over an area of at least 1 cm². In the absence
of insulating soft tissue, all areas of the surface of each specimen are taphonomically
interdependent and unless the specimen is quite large the entire specimen may well
undergo the same degree of heating. One could indicate on drawings of bones which
portions of each specimen are unburned, carbonized, and calcined, but without a very
specific research question or hypothesis that demands such data (an explicit target
variable), and a solid (actualistically evaluated) interpretive model for making sense
of such data (mechanical linkages between burned and unburned portions of each
specimen and the heating regime), recording which surfaces are burned and which
are not, irrespective of the boundaries of the discrete specimen, seems unnecessary.
One critical point concerns (in)explicit identification of the target variable. Is
it merely the proportion (or percent) of burned specimens? Is it the intensity of
burning? If intensity of burning is the target variable, then the manner in which
that variable is defined by the analyst will specify the appropriate variable to measure.
If intensity is defined in terms of the amount of burned bone, then surface area
would be the appropriate measured variable. But if intensity is defined in terms of
the number of burned specimens, then it would be better to use NSP (or NISP) to
quantify burning, and to determine the percent (or proportion) of specimens that
had been burned because that measured variable is the definition of the target variable.
Whatever the case, the validity of all of the measured variables for assessing a
particular target variable is presently unclear because the nature of the relationship
between the two is unknown. Nevertheless, the discussion to this point warrants a
brief digression.

A Digression

All of the taphonomic processes that have been discussed thus far create modifications
on the surface of a specimen, but all of them may, for myriad reasons, modify only
a fraction of the total surface area of a specimen. This fact can be taken advantage
of when tallying for taphonomy. If the portions of the total surface area of each
specimen are not interdependent with respect to the operation of a taphonomic
process or agent, then rather than tallying the total NISP and the NISP with the
modification of interest, and dividing the latter by the former to derive a %NISP
with the modification, a different quantitative protocol can be applied – this can be
referred to as the surface area solution.
The ARC GIS procedure described in Chapter 6 and first described by Marean et al.
(2001) might prove to be a good means for measuring the amount of total surface
area that displays maximum weathering, corrosion, burning, or other taphonomic
attributes. If this procedure is chosen, then the amount of surface area measured will
depend on how accurately specimens are “mapped” onto the computer-stored tem-
plates of skeletal elements, and also how accurately weathered, corroded, and burned
areas are mapped onto the templates. More important epistemologically, however,
is the fact that the taphonomic significance of measurements of amount of weathered,
corroded, and burned surface area is unclear. We do not know enough about
taphonomic processes to be able to interpret these surface area data with any validity.
The surface area solution is thus but one way to measure taphonomic attributes. It
may not be worth pursuing because taphonomic processes are so historically contingent
as to be beyond the analytical fine-scale resolution provided by measures
of proportions of modified surface area (Lyman 1994c, 2004c). Description of other
means to quantify other sorts of taphonomic features will clarify this.


It has long been known that various animals gnaw bones for diverse reasons (Fisher
[1995] and references therein). Generally, damage created by rodent gnawing is readily
distinguished from gnawing damage created by carnivores (Fisher 1995; Noe-Nygaard
1989). Taphonomic knowledge is sufficient to allow the distinction of several kinds
of damage created by carnivores; these include punctures, furrows, and irregular
damage (e.g., Haynes 1980). The NISP (NSP, less often) of gnawed specimens is
usually tallied, and the relative (percentage) abundance of gnawed specimens [100 ×
NISP gnawed/(NISP gnawed + NISP not gnawed)] determined for each taxon, for
each skeletal part or portion of a taxon, or however the analyst believes the data
should be structured (which will depend in part on the research question, which in
turn should explicate the target variable and the measured variable) (e.g., Todd et al.
1997). Thus, one could tally the frequency of femora that display punctures and the
frequency that display furrows, and compare those to the frequency of humeri that
display punctures and the frequency that display furrows, respectively. Or, determine
the relative frequency of gnawed femora and the relative frequency of gnawed humeri
to determine if femora were more extensively (or frequently) gnawed than humeri
(e.g., Cruz-Uribe and Klein 1994). The percentage of specimens of taxon A that have
been gnawed can also be compared with the percentage of specimens of taxon B that
have been gnawed (e.g., Cruz-Uribe and Klein 1994).
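
The percentage calculation given above, 100 × NISP gnawed/(NISP gnawed + NISP not gnawed), is simple enough to sketch directly. The tallies below are invented for illustration and the function name is hypothetical; the same calculation could be keyed by taxon instead of by skeletal element.

```python
def percent_gnawed(nisp_gnawed, nisp_not_gnawed):
    """Relative abundance of gnawed specimens as a percentage of NISP."""
    total = nisp_gnawed + nisp_not_gnawed
    if total == 0:
        raise ValueError("no specimens were tallied")
    return 100.0 * nisp_gnawed / total

# Invented tallies: element -> (NISP gnawed, NISP not gnawed).
tallies = {"femur": (12, 28), "humerus": (9, 41)}
percentages = {
    element: percent_gnawed(gnawed, not_gnawed)
    for element, (gnawed, not_gnawed) in tallies.items()
}
```

With these made-up tallies, 30 percent of femur specimens and 18 percent of humerus specimens are gnawed; whether that difference is meaningful would still require a statistical test and a clearly stated target variable.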
The frequency of gnawed specimens is often interpreted as a measure of the inten-
sity of gnawing. One might argue, however, that the number of gnawing marks per
gnawed specimen is a better measure of gnawing intensity and whether the remains
of one taxon or one kind of skeletal element have been more intensively gnawed than
the remains of another taxon or another kind of skeletal element, respectively. Such a
measure demands two things, one logical and one practical. The logical requirement
is that “intensity of gnawing” be clearly defined. The practical requirement is that
individual gnawing marks – ones made by each biting action or each instance of
dragging teeth across a bone surface – be distinguishable from one another, but they
often are not. Perhaps, however, this is not really a problem. Is a gnawing mark a single
puncture, or furrow, or instance of irregular damage? Was each individual puncture
or furrow or instance of irregular damage created by one bite, or one instance of
teeth contacting bone or dragging across the surface of a specimen?
To answer the last two questions requires that we define intensity of gnawing. Most
taphonomists likely mean how damaged the specimens are or how much energy was
expended by the bone gnawer. More gnawing marks may well mean more energy was
expended, or it might not (Kent 1981). More gnawing marks may simply mean greater
damage to bone surfaces. This ambiguous target variable brings us back to how to
tally such damage, regardless of the meaning of that damage or its frequency. Let's
assume we can distinguish individual gnawing marks. Whereas individual punctures
and furrows could each be tallied as “1,” the category known as irregular damage
presents a problem because individual tooth marks are typically indistinguishable
in such damage. (The lack of distinguishability is particularly acute with respect to
individual rodent gnawing marks [Thornton and Fee 2001].) But irregular damage
also suggests a different way to measure the intensity of gnawing damage.
One could determine the amount of surface area that has been destroyed by gnawing.
Given that there is little clear taphonomic significance to the difference between
tooth punctures, furrows, and irregular damage, the amount of surface area that
has been destroyed by such modifications may well provide a robust measure of the
intensity and degree and extent of gnawing damage. And, such a measure does not
demand that individual tooth marks be distinguished within an irregularly damaged
area. Rather, the total surface area of all specimens could be
inspected for gnawing damage, and the amount (percentage) of surface area actually
damaged, regardless of the type of damage (punctures, furrows, irregular), could be
determined. One could determine such a value for both damage created by rodent
gnawing and damage created by carnivore gnawing, if desired. This brings us back
to the surface-area solution to tallying for taphonomy introduced when burning, weathering,
and corrosion damage were under consideration. If that protocol is used, then the
problem reduces to accurate mapping of specimen borders and of boundaries of
damaged surfaces (see the discussion in Chapter 6).
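
To make the contrast between the two measured variables concrete, the following sketch compares the percentage of specimens with damage against the percentage of inspected surface area that is damaged. The measurements are hypothetical, the function names are illustrative, and the sketch assumes surface areas have already been obtained by some mapping procedure; it does not implement that procedure.

```python
# Hypothetical measurements per specimen: (area inspected, area damaged), in cm^2.
specimens = [(10.0, 0.0), (20.0, 5.0), (10.0, 5.0)]

def percent_nisp_damaged(records):
    """Percentage of specimens that display any damaged surface."""
    damaged = sum(1 for _inspected, area in records if area > 0)
    return 100.0 * damaged / len(records)

def percent_area_damaged(records):
    """Percentage of the pooled inspected surface area that is damaged."""
    inspected = sum(total for total, _area in records)
    damaged = sum(area for _total, area in records)
    return 100.0 * damaged / inspected

by_specimen = percent_nisp_damaged(specimens)
by_area = percent_area_damaged(specimens)
```

In this invented case two of three specimens are damaged but only a quarter of the inspected surface area is, which shows why the choice of measured variable has to follow from the target variable.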
Assuming that one can accurately determine the amount of damaged surface area,
other questions arise. Should the amount of corroded/abraded surface area be sub-
tracted from the total surface area inspected for gnawing damage? What about the
amount of weathered surface area? Answers to these sorts of taphonomic questions
need to be in hand before too much energy is spent designing new ways to quantify
traces of taphonomic processes and agents. Thus, for the present, the ratio of gnawed
specimens to gnawed plus ungnawed specimens, expressed as a percentage or proportion,
is an acceptable measure of gnawing intensity. There is a final, general category
of damage to bone surfaces that, at least from a zooarchaeological perspective, may
help bring the preceding portion of this chapter into fine-resolution focus.


Butchering is the human reduction and modification of an animal carcass into usable
or consumable parts (Lyman 1987a, 1992b, 2005b). It involves the set of hominid
behaviors and activities that occur between the time of carcass procurement (regardless
of how it is procured [e.g., hunted or scavenged] or its condition) and final disposal
or abandonment of variously consumed and unconsumed, used and unused
portions of the carcass. Butchering behaviors occur in varying orders and frequencies
or intensities at various times for different carcasses because butchering is histori-
cally contingent, which means that the particular order and frequency of individual
behaviors depends on a plethora of variables such as carcass size, carcass location
on the landscape, time of day, air temperature, number of butchers, and butchering
tools available. Butchering activities have traditionally been categorized as belonging
to one of three or four basic kinds: skinning, dismemberment or disarticulation,
filleting or removing meat from bones, and marrow and grease extraction (Binford
1981; Guilday et al. 1962; Noe-Nygaard 1977; Pozorski 1979). The first two sets of
activities are focused on reducing a carcass into manageable pieces whereas the third
focuses on extraction of consumable meat external to the bones and the last set of
activities focuses on extraction of within-bone nutrients. There are a plethora of
other activities involved, including evisceration, extraction of blood, brains, bone,
and sinew, and periosteum removal that can take place but that can be subsumed
within one of the three or four traditionally recognized general activities.
Each butchering activity, regardless of how it is categorized, can produce what are
typically called butchering marks. Many believe that such marks can be reliably and
validly identified (e.g., Blumenschine et al. 1996; Fisher 1995), so here the focus is on
how these marks are quantified. The reasons to worry about counting butchering
marks are several. Most simply, butchering is a process; it begins with a single discrete
entity (carcass) and ends up with multiple discrete entities (disarticulated and dis-
associated complete and incomplete skeletal elements, hide, brain, marrow, muscle
masses, etc.). As butchering progresses, the carcass is reduced into successively more
numerous discrete pieces. The butchering process often involves the application of
various kinds of forces to the carcass to reduce it into consumable and usable pieces.
These forces can modify bones by breaking them and they can modify bone surfaces
by scarring them. It is the marks that are created by butchering that are of interest
here; quantitative measures of fragmentation are discussed in Chapter 6.
As the butchering process continues, more marks may be created on the bones of a
carcass. It is likely for this reason that many zooarchaeologists have sought to measure
the intensity of butchering, by which is meant, it seems, the amount of energy invested
in butchering. Butchering intensity is measured by tallying butchering damage evi-
dent on a collection of faunal remains (e.g., Haynes 2002). For example, Binford
(1988:127) suggested that “the number of cut marks, exclusive of dismemberment
marks, is a function of differential investment in meat or tissue removal.” Other
zooarchaeologists have tallied frequencies of various kinds of butchering marks for
other reasons. Bunn and Kroll (1986:432), for example, state that “frequencies of
cut marks on different skeletal parts can be directly linked to the skinning, disarticulation,
and defleshing of carcasses” and that multiple occurrences of marks in a
particular anatomical location indicate, say, “repeated dismemberment of the elbow
joint.” Regardless of the reason for tallying frequencies of butchering damage, if they
are to be tallied, we must first have explicit definitions of what the various kinds of
damage are. It is to that topic that we turn next.

Types of Butchering Damage

There are several variables that may be considered when tallying butchering marks,
but one of them is virtually always considered. That variable concerns the type of
mark. There are several basic kinds of marks the morphologies of which are dependent
on the type of force and aspects of force application used to create them (Fisher 1995;
Greenfield 1999; Lyman 1987a; Noe-Nygaard 1989; Potter 2005; Thompson 2005). It
is likely because of the different kinds of force and different ways that force is applied
through an intermediary (tool) to a bone surface that most zooarchaeologists can
reliably and validly distinguish the various mark types that have been recognized
(Blumenschine et al. 1996).
One kind of force involves dynamic percussion, such as when a hammer stone
impacts a bone resting on a firm surface. This type of force application involves
a more or less blunt (as opposed to sharp-edged) implement and abrupt dynamic
loading (impact) that produces impact notches, flake scars, and various scratches
(Blumenschine and Selvaggio 1988; Pickering and Egeland 2006). Another kind of
force involves sawing or slicing forces that produce what are termed “cut marks” or
“striae.” Sawing and slicing involve force application parallel to the long axis of the
cutting edge of the tool. Scraping is similar to slicing and sawing, though the latter
two are generally back and forth whereas the former is generally in one direction
and force is applied perpendicular to the long axis of the implement's working edge.
Chopping is dynamic loading with a sharp edge; Gifford-Gonzalez (1989) considers
it to be a cutting-like process, and although it can be, I conceive of it as something
of a hybrid between cutting and percussion.
Percussion marks, cut marks, and scraping marks tend to have been produced in
prehistoric contexts by butchers with primitive (preindustrial) technologies. Chop-
ping marks are made in such contexts as well, but they are also made in historic
contexts by butchers with industrial-grade (metal) technologies (Landon 1996). So,
too, are saw cuts. By the latter is meant cuts made with metal saws (e.g., Lyman 1977).
Saw cuts can be tallied in various ways, most of which are the same as the ways used to
tally cut marks, percussion marks, and chopping and scraping marks. For the sake of
simplicity, discussion is limited to percussion marks and cut marks in the following.

Tallying Butchering Evidence: General Comments

A classic statement in zooarchaeology is this: “It is quite possible to butcher an
animal of any size without leaving a single [butchering] mark on any bone” (Guilday
et al. 1962:64). This claim was reiterated at least twice more in later years (Bunn
and Kroll 1988; Crader 1983), so it is perhaps not surprising that numerous analysts
subsequently suggested various reasons why a bone might not display butchering
marks despite the fact that the portion of the carcass represented by that bone had
apparently been butchered. Shipman and Rose (1983:86) suggested that “soft tissues
have an ability to shield bones from being marked by bone or stone tools.” They found
in an experimental context that even the periosteum (a <1 mm thick, soft-tissue
covering of bone) shields bone surfaces from cut marks (Shipman and Rose 1983:70).
Gifford-Gonzalez (1989:202) made a similar observation in an ethnoarchaeological
context. Olsen and Shipman (1988:545) argued “butchering requires a light touch to
prevent crushing and dulling the tool's edge by contact with the bone,” and a butcher's
desire to not dull a cutting tool would result in few butchering marks. Guilday et al.
(1962:64) thought that the probability that a bone will display a butchering mark is
a function of “the skill of the [butcher]. [Further,] the more hurried or careless the
process the greater the probability that the bone will [display a butchering mark]”
(see also Maltby 1985:22). Gilbert (1979:235) echoed this when he noted that butcher-
ing marks were likely created by “the sloppiest efforts at carcass division.” Finally,
Maltby (1985) underscores the fact that it is possible that not all carcasses represented in
a collection were butchered in like manner. All of these statements presume that
although all bones are butchered (speaking metaphorically; see next paragraph),
only some of them – for various reasons – in a collection will sustain butchering
marks. This presumption as yet has no empirical (actualistic) basis. Nevertheless,
when seeking to measure the “intensity” of butchering by quantifying butchering
damage, a seldom acknowledged assumption is required.
Most analysts (implicitly) assume that given some set of bones X, some subset X′
of those bones will be butchered, and of those butchered bones some subset X″ will
sustain damage in the form of butchering marks. The critically important assumption
during analysis and interpretation, then, is that some proportion of each category of
skeletal part was butchered and some lesser proportion will display butchering marks,
and those two proportions will directly and positively covary at least at an ordinal
(but likely not a ratio) scale. I am speaking metaphorically when I say that bones are
butchered because it is actually carcasses and carcass parts that are butchered, with
the notable exception of fracturing of bones for purposes of marrow extraction and
grease rendering (e.g., Noe-Nygaard 1977). I use the metaphorical shorthand form
bones are butchered here for convenience and efficiency.
A fictitious example will make clear the significance of the requisite analytical
assumption. Let's say that there were ten femora and ten humeri available for
butchery (X), and all are present in the archaeological collection we are studying. Of
those, six femora and five humeri were in fact butchered (X′); say, for example, that
flesh was removed from them. Of those butchered elements, for whatever reason(s),
only four femora and two humeri display butchery marks (X″). The critical statistical
relation here is that more femora than humeri were butchered, and that more femora
than humeri display (archaeologically visible) butchery marks. Sixty percent of the
observed femora and 50 percent of the observed humeri were butchered, but in fact
only 40 percent of the observed femora and 20 percent of the observed humeri display
butchering marks. Thus we could say that femora were more intensively butchered
than humeri because proportionally more of the femora than humeri display butcher-
ing marks (where intensity concerns the amount of energy invested; more butchering
marks and more butchery marked bones are thought to signify more energy). But
if in fact six of ten femora were butchered and five of ten humeri were butchered,
but only one femur displays butchering marks and three humeri display butchering
marks, then we would be wrong to conclude that humeri were more intensively
butchered than femora (discussion derived from Lyman 1992b, 1995b). Notice that
“quantifying butchering damage” was said rather than “quantify butchering marks”
or “tally butchery marked bones.” The latter two are often used synonymously
when in fact it should be (and will become) clear that they are quite different.
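
The fictitious femora and humeri example can be restated as a small check of the ordinal assumption. The counts come from the example above; the function names are hypothetical, and the check simply asks whether the element with the higher proportion butchered also has the higher proportion displaying marks.

```python
def proportions(available, butchered, marked):
    """Return (proportion butchered, proportion marked) of the available elements."""
    return butchered / available, marked / available

def same_rank_order(a, b):
    """True when the two proportions rank elements a and b in the same order."""
    return (a[0] - b[0]) * (a[1] - b[1]) > 0

# First scenario: 6 of 10 femora butchered, 4 marked; 5 of 10 humeri butchered, 2 marked.
femur = proportions(10, 6, 4)
humerus = proportions(10, 5, 2)
consistent = same_rank_order(femur, humerus)

# Second scenario: same butchery, but 1 femur and 3 humeri display marks.
femur_alt = proportions(10, 6, 1)
humerus_alt = proportions(10, 5, 3)
misleading = not same_rank_order(femur_alt, humerus_alt)
```

In the first scenario the marked proportions preserve the rank order of the butchered proportions; in the second they reverse it, which is the situation in which an inference about butchering intensity would be wrong.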
The preceding discussion focuses on tallying the number of skeletal elements that
display butchering marks. This counting procedure mimics those used to quantify
burning, corrosion, gnawing, and the like. In all cases, the tallying procedure pro-
vides data that answer the question: What proportion (or percentage) of specimens
(usually identifiable, or NISP) display a particular kind of taphonomic modification?
However, the target variable seems, based on inferences attending observations of
percentages of butchery marked specimens, to be the intensity of butchering, which
is seldom clearly defined but based on published interpretations and a few explicit
statements involves the amount of energy spent (e.g., Haynes 2002). Some analysts
have therefore worried that tallying the number of butchery marked bones does not
actually measure the target variable but something else. These individuals argue that
to measure the intensity of butchering, one needs to tally the number of butchering
marks so as to have quantitative data that actually reflect the intensity of butchering
(Abe et al. 2002; Marean et al. 2001). This is an important observation about the relationship
(or lack thereof) between a measured variable (NISP of butchery marked
bone) and a target variable (intensity of butchering).

Tallying Percussion Damage

The morphometric criteria for identifying flake scars and percussion damage are
spelled out in various places (e.g., Blumenschine and Selvaggio 1988, 1991; Capaldo
and Blumenschine 1994; Fisher 1995). Typically, zooarchaeologists have tallied the
NISP (NSP more rarely) displaying flake scars, percussion notches, and other
percussive damage, and then calculated the proportion or percentage of NISP that displays
such marks. Some individuals have tallied the number of marks (e.g., Kooyman
2004), even though it is likely that the number of marks is at least partially a function
of the number of specimens examined. Furthermore, some notches overlap
one another, and sometimes a single blow will produce a nested series of flake scars
(Capaldo and Blumenschine 1994), although perhaps with sufficient training and
experience potential difficulties with identifying and counting flake scars and percussion
marks might be minimized (Blumenschine et al. 1996). Recent experimental
work suggests that tallying percussion damage as the number of distinct marks may
be accomplished rather accurately, but the frequency of percussion marks did not
correlate with the number of hammerstone blows administered to bone specimens
in one set of experiments (Pickering and Egeland 2006). Therefore, for the present,
interpreting percussion-mark frequencies (rather than number of specimens with
percussion marks) in terms of intensity of butchering (energy invested) is precluded.
There is no empirically demonstrable relationship between the target variable and
the measured variable.
One might choose to tally the frequency of percussion-damaged specimens across
different skeletal elements of a taxon, or across a common set of skeletal elements of
several taxa. The analyst might wonder if more humeri specimens, say, display per-
cussion damage than do femora specimens of deer. Alternatively, one might wonder if
more long bones of wapiti display percussion damage than do the long bones of deer;
wapiti tend to be two to four times larger than deer. In one set of collections, I found
that deer long bones had significantly more flake scars than did wapiti long bones, but
in another set of collections exactly the opposite situation was found (Lyman 1995b).
Why this was the case seemed to relate to the kinds of other resources that were
exploited, but lack of actualistic data linking the variables precluded straightforward
interpretation.
In sum, there are two basic ways to record percussion damage: as the number of
damaged specimens (generally reported as %NISP that has such damage), and the
number of instances of force application manifest as individual flake scars, percussion
notches, and the like. Explicit statement of a research question will help explicate the
target variable and an appropriate measured variable. The relationship between the
two, however, may well be unknown, and experimental work is needed in such cases
to establish that relationship.
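The two recording conventions can be contrasted with a minimal sketch. The specimen records, the class name `Specimen`, and both function names are hypothetical illustrations, not data or procedures from any cited study.

```python
from dataclasses import dataclass


@dataclass
class Specimen:
    """One identified specimen and the percussion marks counted on it (hypothetical)."""
    element: str
    percussion_marks: int  # flake scars, notches, etc., each tallied as one mark


def percent_nisp_damaged(specimens):
    """%NISP: percentage of identified specimens showing at least one mark."""
    if not specimens:
        raise ValueError("no specimens to tally")
    damaged = sum(1 for s in specimens if s.percussion_marks > 0)
    return 100.0 * damaged / len(specimens)


def total_marks(specimens):
    """Alternative tally: number of individual marks across all specimens."""
    return sum(s.percussion_marks for s in specimens)


# Invented assemblage used only to show that the two tallies differ in kind.
assemblage = [
    Specimen("humerus", 0),
    Specimen("humerus", 3),
    Specimen("femur", 1),
    Specimen("femur", 0),
    Specimen("tibia", 0),
]

print(percent_nisp_damaged(assemblage))  # share of specimens with damage
print(total_marks(assemblage))           # count of individual marks
```

Neither number, by itself, says anything about the intensity of butchering; that link still has to be established experimentally.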

Tallying Cut Marks and Cut Marked Specimens

The morphometric criteria for identifying cut marks are described in numerous
places (e.g., Blumenschine et al. 1996; Fisher 1995; Greenfield 1999; Lyman 1987a;
Shipman and Rose 1983), and the identification of such marks is seldom questioned
these days. What is receiving the most analytical attention in the first decade of the
twenty-first century is how to count cut marks. There are several ways that cut marks
have been tallied. Seldom is the proportion of butchery marked skeletal elements (not
specimens) determined (see Todd et al. [1997] for an example of tallying cut marked
elements). Sometimes, the %NISP that display cut marks is calculated, but that is
perhaps not a good procedure given the potential for variation in fragmentation
either across the different skeletal elements of a taxon or across different taxa (Abe
et al. 2002). What many analysts do is specify some anatomical area or portion,
say, the distal humerus, the greater trochanter of the femur, or a diaphysis
fragment of the tibia, and then determine
how many of each of those portions display cut marks (e.g., Guilday et al. 1962; Lyman
1992b; Snyder and Klippel 2003). This assists with keeping track of the anatomical
distribution of cut marks. Are they all on diaphyseal pieces, or half on epiphyses and
half on diaphyses, and do proportionately more proximal femora have cut marks
than distal femora? Thus, if two of five distal humeri display cut marks, then one
would conclude that 40 percent of the distal humeri in the collection have butchering
marks. The fact that three of those five humeri are complete skeletal elements, one
consists of the distal end and distal one third of the diaphysis, and the fifth consists
of just the distal condyle is irrelevant to quantifying cut-mark data when they are
tallied by an anatomical location of relatively greater or lesser specificity (Lyman
Other analysts tally the number of individual cut marks. For example, Milo
(1998:109) argued that, based on his own butchering experiments, “the relative effort
put into cutting in different areas is best reflected by the number of times the tool
scored the bone.” But Milo (1998) worried that differential representation of skeletal
parts would skew tallies of individual marks. To avoid this sort of problem, Bunn
(2001) tallied the total number of cut marks observed on each kind of skeletal
element (e.g., humeri, femora). He then divided each kind of skeletal element (he
was dealing solely with limb bones) into five, more-or-less equal-sized areas:
proximal end, proximal shaft, midshaft, distal shaft, and distal end. The underpinning
assumption to the five areas is that cut marks near the ends of long bones likely have
something to do with disarticulation whereas those on shafts result from defleshing.
Finally, Bunn determined the percentage of all cut marks per kind of skeletal element
that occurred in each of the five areas. This is indeed one way to contend with the
differential representation of skeletal parts.
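The per-element percentages just described amount to simple arithmetic, sketched below. The cut-mark counts and the function name `percent_by_area` are invented for illustration; they are not Bunn's data or code.

```python
# The five long-bone areas named in the text.
AREAS = ("proximal end", "proximal shaft", "midshaft", "distal shaft", "distal end")


def percent_by_area(counts):
    """Express the cut marks in each area as a percentage of all cut marks
    tallied on one kind of skeletal element."""
    total = sum(counts.get(area, 0) for area in AREAS)
    if total == 0:
        return {area: 0.0 for area in AREAS}
    return {area: 100.0 * counts.get(area, 0) / total for area in AREAS}


# Hypothetical cut-mark tallies for all humeri in a collection.
humerus_marks = {
    "proximal end": 2,
    "proximal shaft": 3,
    "midshaft": 10,
    "distal shaft": 4,
    "distal end": 1,
}

for area, pct in percent_by_area(humerus_marks).items():
    print(f"{area}: {pct:.1f}%")
```

Because each kind of element is scaled to its own total, an element represented by many specimens does not swamp one represented by few.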
There are other ways that the number of cut marks might be tallied and analyzed,
but a potentially significant problem attends any such tallying, regardless of how
those tallies are mapped on anatomy or analyzed. If each individual cut mark or
single striation is to be tallied, then they must somehow be distinguished for tallying
purposes. The problem arises when cut marks overlap. If, for example, a sawing-like
motion is used (the cutting tool’s edge drags across the bone surface on both push
and pull strokes that overlay each other), then strokes producing striae made later
in the sequence of strokes may obliterate or at least obscure striae made by earlier
strokes. There is no experimental work that evaluates this possibility, and no one
has assessed a paleozoologist’s ability to accurately count individual cut marks. But
this may be the least of our concerns. For now, however, let us assume that we can
tally the number of individual cut marks (each representing a distinct arm stroke).
How, then, might we obtain counts of cut marks as opposed to counts of cut marked
specimens or tallies of cut marked anatomical areas?

The Surface Area Solution

Recently, the argument has been made that variation in the representation of sur-
face area is relevant to tallying frequencies of cut marks. Abe et al. (2002:650) are
concerned about what they refer to as the “fragmentation dilemma.” In particular,
they are worried that “fragmentation generally decreases the number of cut marked
fragments and cut mark counts relative to total fragments” (Abe et al. 2002:649).
Fragmentation generally means breakage such that what was a single discrete object
after fragmentation comprises multiple discrete objects (Lyman 1994c:509). Frag-
mentation destroys the original integrity of a discrete object, but the material or
substance comprising that discrete object still exists and, importantly, some of the
original integrity of the discrete object may also remain. Destruction generally means
the complete loss of the original integrity of the object, such as when the specimen
is crushed into dust or into pieces that are so small that they cannot be identified
and thus are analytically invisible. Abe et al. (2002:649) state “the fragmentation
process moves fragments into the unidentifiable category and destroys less-dense bone
altogether.” They have in mind extreme fragmentation or destruction in the sense
that the specimen is analytically invisible. This process was recognized long ago by
Watson (1972; see also Lyman and O’Brien 1987). Less extreme fragmentation, such
as when a bone specimen is broken into two or three pieces that can be identified
to skeletal portion, may increase the number of cut marked specimens if a fracture
plane truncates a cut mark such that half of the mark occurs on one specimen and
the other half occurs on another specimen. Refitting specimens is the only way to
correct for this.

Given their concern about the destruction of cut marks, Abe et al. (2002:650)
suggest “the likelihood of a cut mark being preserved and counted by an analyst
is a function of the amount of bone surface area studied and recorded.” This is
commonsensical: the more surface area examined, the more cut marks will be
found. Abe et al. (2002:650) use this observation, however, to argue that (i) because
more cut marks will be found if more surface area is examined, (ii) if we determine
the density of cut marks per unit of surface area in an anatomical region, (iii) then “we
can correct the number of cut marks by the amount of examined surface area, much
as demographers standardize population size by estimating population density.” In
particular, they assume that if half (50 percent) of the potential surface area of an
anatomical region has been examined, and ten cut marks have been tallied, then were
100 percent of the surface area of that region examined, twenty cut marks would be
tallied. They are assuming that the density of cut marks on an observed sample is
the density of cut marks on the unobserved remainder of the population. They are
assuming precisely what they are trying to discover: the original (predestruction)
frequency of cut marks (see also Lyman 2005b).
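The arithmetic behind the surface-area correction, as characterized in the preceding paragraph, can be written out in a few lines. The function name is hypothetical, and the sketch makes the embedded assumption explicit rather than endorsing it.

```python
def extrapolated_cut_marks(observed_marks, fraction_examined):
    """Scale an observed cut-mark tally up to 100 percent of the surface area.

    ASSUMPTION built into the calculation: the density of cut marks on the
    unexamined (missing or destroyed) surface equals the density on the
    examined surface. That is the very quantity one is trying to discover.
    """
    if not 0 < fraction_examined <= 1:
        raise ValueError("fraction_examined must be in (0, 1]")
    return observed_marks / fraction_examined


# The worked case from the text: ten marks tallied on half the surface area.
print(extrapolated_cut_marks(10, 0.5))
```

The calculation is trivially easy to perform; its validity rests entirely on the uniform-density assumption flagged in the docstring.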
Abe et al. (2002) are trying to take advantage of the visibility of one variable
(frequencies of cut marks on observable bone surface area) in order to measure a
variable that is invisible: cut mark frequencies on missing or destroyed bone surfaces.
There are a plethora of problems with this procedure. The first problem is that one
must decide what comprises the sample of specimens to be examined for any given
analysis, and thus one must decide how to define an aggregate of remains. If different
specimens are included in a sample, different results are likely to attend analysis of cut
mark frequencies. The second problem concerns how to define anatomical regions
for which the amount of observed surface area will be determined. The proportion
of observed surface area is based on the maximum MNE for any given anatomical
region (e.g., proximal end, proximal shaft, mid shaft). Thus, if there is evidence of ten
distal humeri (= MNEmax), then there should be ten proximal humeri represented
even if only two are observed. But, are proximal ends and distal ends, proximal shafts
and distal shafts and mid shafts, such as proposed by Abe et al. (2002), appropriate
tallying units? That is presently unclear.
The third problem is that the analytical procedure ignores the historically con-
tingent nature of butchering episodes (Lyman 1987a, 2005b). It is easy to show that
even in experimentally controlled situations, for reasons that are unclear, there is a
tremendous range of variation in the frequency of cut marks generated in any given
butchering episode. Consider the experimental data generated by Pobiner and Braun
(2005) and summarized in Table 7.4. Those data are the number of cut marks generated
during the defleshing of six goat hindlimbs (femora and tibiae). Each hindlimb

Table 7.4. Frequencies of cut marks per anatomical area on six experimentally
butchered goat (Capra hircus) hindlimbs. %FR, amount of flesh removed from
femur prior to butchering. N-CM, number of cut marks; P, proximal end; PS,
proximal shaft; MS, mid shaft; DS, distal shaft; D, distal end. Data from
Pobiner and Braun (2005)

Limb  Element  %FR   P    PS   MS   DS   D
1     Femur    50    0    0    13   15   0
2     Femur    50    0    3    11   0    0
3     Femur    25    0    1    8    3    0
4     Femur    25    0    0    0    5    3
5     Femur    0     0    5    21   2    0
6     Femur    0     22   6    6    19   0
1     Tibia    0     0    0    0    0    0
2     Tibia    0     0    2    0    5    0
3     Tibia    0     0    2    0    0    0
4     Tibia    0     0    7    13   0    0
5     Tibia    0     0    20   3    0    0
6     Tibia    0     0    0    0    10   0

was defleshed independently of every other hindlimb, and although the amount of
flesh on the femur varied when each butchery event began, nothing else did. If Abe
et al. (2002) are correct that the observed density of cut marks (frequency per unit
area) can be used to estimate the frequency of cut marks that have been destroyed,
then there should be minimal variation in the number of cut marks per anatomical
region described in Table 7.4, given that those anatomical regions are identical from
specimen to specimen in terms of surface area.
The data in Table 7.4 indicate that there is a great deal of variation in the density of
cut marks, or the number of cut marks per unit of surface area even when long bones
are treated as comprising five distinct regions (proximal and distal ends, proximal
and distal shafts, mid shaft). And this is so regardless of whether the amount of
meat on a bone was similar from case to case or was different from case to case.
Frequencies of cut marks in a given region on individual femurs range from zero to
twenty-two (proximal femur), and on individual regions of tibiae they range from
zero to twenty (proximal shaft). Given that the amount of surface area of, say, the
proximal tibia shaft does not vary significantly across the six specimens, following
Abe et al.’s (2002) suggested procedure, were only the proximal shaft of tibia five
recovered, its twenty cut marks would suggest that there were twenty cut marks on
each of the other missing proximal shafts of tibia (based on an MNE of six total
recovered distal tibiae). Data in Table 7.4 indicate that such an inference is incorrect.
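The tibia example can be checked directly against Table 7.4. The sketch below transcribes the cut-mark counts from that table; the function names are hypothetical, and the extrapolation rule (one recovered region multiplied by the MNE) is a simplified reading of the procedure as described above.

```python
REGIONS = ("P", "PS", "MS", "DS", "D")

# Cut marks per region for each limb, transcribed from Table 7.4.
FEMUR = {
    1: (0, 0, 13, 15, 0),
    2: (0, 3, 11, 0, 0),
    3: (0, 1, 8, 3, 0),
    4: (0, 0, 0, 5, 3),
    5: (0, 5, 21, 2, 0),
    6: (22, 6, 6, 19, 0),
}
TIBIA = {
    1: (0, 0, 0, 0, 0),
    2: (0, 2, 0, 5, 0),
    3: (0, 2, 0, 0, 0),
    4: (0, 7, 13, 0, 0),
    5: (0, 20, 3, 0, 0),
    6: (0, 0, 0, 10, 0),
}


def region_range(table, region):
    """Minimum and maximum cut-mark count for one region across all limbs."""
    i = REGIONS.index(region)
    values = [row[i] for row in table.values()]
    return min(values), max(values)


def extrapolate_from_one(table, limb, region, mne):
    """Assume every one of `mne` elements carries the count seen on one limb."""
    return table[limb][REGIONS.index(region)] * mne


def actual_total(table, region):
    """Cut marks actually generated on that region across all limbs."""
    i = REGIONS.index(region)
    return sum(row[i] for row in table.values())


print(region_range(FEMUR, "P"), region_range(TIBIA, "PS"))
print(extrapolate_from_one(TIBIA, limb=5, region="PS", mne=6))
print(actual_total(TIBIA, "PS"))
```

Comparing the last two printed values shows how far an estimate based on a single surviving proximal shaft can stray from the number of marks actually produced.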
The fourth problem that attends determination of the number of missing cut
marks based on observable frequencies of cut marks per unit of surface area is that it
is not at all clear what the visible frequencies of cut marks are measuring. Abe et al.
(2002:657) state that a “key assumption that all zooarchaeologists make is that more
intensive cutting (more cutting actions) results in higher frequencies of cutmarks
on the bone surface.” This is indeed a key assumption. Given that creating two cut
marks requires two arm strokes, but creating one cut mark requires one arm stroke,
it is likely that what most analysts mean by intensity is number of arm strokes. The
analytical assumption in a paleozoological context, then, must be that as the number
of arm strokes or slices increases, so too does the number of cut marks created.
Unfortunately, experiments by Egeland (2003) indicate that there is no relationship
between the number of arm strokes used to butcher limbs of large mammals and the
number of cut marks that are generated.
Egeland (2003) butchered sixteen partial and complete limbs (fore and hind) of
domestic cows (Bos taurus) and domestic horses (Equus caballus). Stone tools were
used to remove flesh, arm strokes aimed at flesh removal were tallied, and the amount
of flesh removed was recorded (Table 7.5). There is neither a statistically significant
relationship between the number of arm strokes and the number of cut marks created
across the ten multiskeletal element limbs Egeland butchered (r = −0.206, p = 0.52),
nor is there a statistically significant relationship between the number of arm strokes
and the number of cut marks created across the 31 individual skeletal elements
Egeland butchered (r = −0.20, p = 0.28). These results do not change if the data are
log-transformed (Figure 7.5). This means that when we tally cut marks, we cannot
conclude that more cut marks on skeletal parts comprising the ankle joint than on
skeletal parts comprising the wrist joint means that the ankle was more intensively
butchered than the wrist. There is no actualistic research indicating the validity of the
relationship between the two variables (measured = number of cut marks; target =
number of arm strokes or intensity) and there are actualistic data (Egeland’s) which
show that at least sometimes there is no such relationship at all.
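A reader who wishes to examine the stroke and cut-mark relationship can compute a Pearson correlation from the element-level rows of Table 7.5. The sketch below is one way to do so; leaving out the whole-hindlimb row for episode 1 to arrive at thirty-one elements is an assumption about how the published analysis was set up, and the coefficient obtained need not match the reported value to the last digit.

```python
import math

# (arm strokes, cut marks) for the 31 individual skeletal elements in Table 7.5.
# ASSUMPTION: the episode 1 whole-hindlimb row is excluded.
ELEMENTS = [
    (535, 8), (877, 7), (525, 44), (577, 8), (582, 14), (594, 1), (202, 1),
    (2155, 3), (420, 29), (1757, 0), (650, 2), (687, 31), (715, 22),
    (395, 5), (371, 7), (362, 8), (739, 26), (1124, 9), (586, 4),
    (5397, 13), (2265, 17), (2080, 9), (986, 0), (532, 0), (951, 0),
    (148, 20), (178, 33), (596, 31), (695, 31), (502, 107), (877, 29),
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


strokes = [s for s, _ in ELEMENTS]
cut_marks = [c for _, c in ELEMENTS]
r = pearson_r(strokes, cut_marks)
print(f"n = {len(ELEMENTS)}, r = {r:.3f}")
```

A coefficient that is weak and negative is what one would expect to see if more strokes do not translate into more marks.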
In sum, then, the surface area solution proposed by Abe et al. (2002), although
perhaps solving various problems that attend tallying the number of specimens that
have cut marks, introduces problems of its own. It is dependent on the aggregate of
specimens included, it is dependent on how skeletal regions are defined, it ignores
the historically contingent and variable process of butchering, and one ultimately
assumes what one is trying to ascertain. The last is so because the analytical protocol

Table 7.5. Frequencies of arm strokes and cut marks on sixteen limbs of cows and
horses. Number in first column identifies the unique butchering episode. Data
from Egeland (2003)

Episode  Limb/Element          Taxon  N of Strokes  N of Cut Marks  Meat Removed (kg)
1        hindlimb              cow    3747          11
2        forelimb/scapula      horse  535           8               8.60
2        forelimb/humerus      horse  877           7               7.10
2        forelimb/radius-ulna  horse  525           44              2.80
3        hindlimb/tibia        horse  577           8               3.40
4        hindlimb/tibia        horse  582           14              2.50
5        hindlimb/tibia        horse  594           1               3.80
6        hindlimb/tibia        horse  202           1               0.50
7        hindlimb/femur        horse  2155          3               25.7
7        hindlimb/tibia        horse  420           29              4.5
8        hindlimb/femur        horse  1757          0               17.5
8        hindlimb/tibia        horse  650           2               2.9
11       hindlimb/femur        cow    687           31              17.4
11       hindlimb/tibia        cow    715           22              4.1
12       forelimb/scapula      cow    395           5               6.1
12       forelimb/humerus      cow    371           7               4.8
12       forelimb/radius-ulna  cow    362           8               1.9
13       forelimb/scapula      horse  739           26              3.1
13       forelimb/humerus      horse  1124          9               4.4
13       forelimb/radius-ulna  horse  586           4               2.1
14       forelimb/scapula      horse  5397          13              8.4
14       forelimb/humerus      horse  2265          17              5.5
14       forelimb/radius-ulna  horse  2080          9               2.4
15       forelimb/scapula      cow    986           0               3.0
15       forelimb/humerus      cow    532           0               6.3
15       forelimb/radius-ulna  cow    951           0               2.3
19       forelimb/scapula      cow    148           20              1.1
19       forelimb/radius-ulna  cow    178           33              0.7
21       forelimb/scapula      cow    596           31              5.2
21       forelimb/radius-ulna  cow    695           31              2.4
22       forelimb/scapula      cow    502           107             5.8
22       forelimb/radius-ulna  cow    877           29              2.3

figure 7.5. Relationship between number of arm strokes and number of cut marks on
thirty-one skeletal elements (r = −0.235, p = 0.2). Data from Table 7.5.

demands the assumption that a sample of bone surface area gives an accurate estimate
of the density of cut marks across the total (population’s) surface area of bones,
whether those bones are present, destroyed, or not collected. This might be so if
cut marks were randomly distributed across bone surfaces, but this is unlikely to be
true for many reasons and empirical data indicate it is not true. In short, we cannot
assume what we are trying to discover.


One driving force behind study of the frequencies of bone specimens displaying
butchery damage and frequencies of specimens displaying carnivore damage con-
cerns the roles of meat eating and of carcass acquisition (hunting or scavenging)
in hominid evolution (see Domínguez-Rodrigo [2002] and Lupo and O’Connell
[2002] for recent reviews). The underlying assumption comprises two interrelated
parts. First, if carnivores have access to a prey carcass before tool-carrying butch-
ers, prey bones will have many tooth marks but few butchering marks; if hominid
butchers have access to a prey carcass prior to access by a carnivore/scavenger, then
bones of prey will have many butchering marks (especially cut marks representing
defleshing of meat-rich proximal limb elements) and few tooth marks of carnivores.
Second, the more flesh on bones, the more cut marks are expected. Each of the
preceding statements is carefully phrased; each refers to the frequency of marks, not the
frequency of marked skeletal elements or skeletal specimens. The target variable is
clear (how many marks are there per specimen); the significant assumption is that
more flesh results in more marks, whether cut marks or tooth marks. Variation in
the relative frequencies of the two kinds of marks depends on order of access and
the amount of flesh remaining that the second carnivore (whether a quadruped or
biped) can exploit.
The critical assumption is worded so as to emphasize that the frequency of marks,
whether butchering marks or tooth (gnawing) marks, is the critical variable; that is
in fact also how individuals who have debated the issue phrase the assumption (e.g.,
Binford 1986, 1988; Bunn and Kroll 1986, 1988; Domínguez-Rodrigo 2002; Lupo and
O’Connell 2002; Pobiner and Braun 2005; Selvaggio 1994, 1998; Thompson 2005).
But almost without fail, paleozoologists involved in the discussion do not tally up cut
marks and tooth marks; instead they tally up cut marked bones and tooth marked
bones and analyze those frequencies. The relationship between the number of marked
bones (measured variable) and the property or process of interest (target variable)
is obscure. Furthermore, the target variable is inexplicit (is it the amount of meat
associated with a bone, the size of the carcass, the size of the bone?), and this contributes
to the obscure relationship between it and the measured variable. Some examples
will make this clear.
Among the data in Table 7.5, the number of strokes necessary to deflesh an individual
skeletal element is significantly correlated with the amount of meat removed
from the element (r = 0.365, p = 0.044), especially if both variables are log transformed
(r = 0.592, p = 0.0005; Figure 7.6). The number of cut marks per skeletal
part, however, is not correlated with the amount of meat removed from a bone
(r = −0.079, p = 0.67), and this holds true for the log transformed data as well (Figure 7.7).
These results suggest the interpretive assumption that more cut marks means there
was more meat on bones for stone-tool wielding butchers to remove is unfounded.
However, the data in Table 7.6 may support the assumption. Those data were gener-
ated by Pobiner and Braun (2005), who provided eighteen hindlimbs comprising the
femur and tibia to stone-tool wielding butchers. But, before the limbs were turned
over to the butchers, different amounts of flesh were removed by Pobiner and Braun
to simulate early access to fully fleshed limbs, later access to partially fleshed limbs (25
percent of flesh removed prior to butchery), and still later access to rather defleshed
limbs (50 percent of flesh removed prior to butchery).

figure 7.6. Relationship between number of arm strokes necessary to deflesh a bone and
the amount of flesh removed (r = 0.592, p = 0.0005). Data from Table 7.5.

figure 7.7. Relationship between number of cut marks and the amount of flesh removed
from thirty-one limb bones (r = −0.035, p = 0.85). Data from Table 7.5.
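The effect of log transformation on the relationship between arm strokes and meat removed can be explored with the element-level rows of Table 7.5. The sketch below assumes base-10 logarithms and the thirty-one rows that report a meat weight; neither choice is stated in the text, so the coefficients it prints are a check on direction and rough magnitude rather than a reproduction of the published statistics.

```python
import math

# (arm strokes, meat removed in kg) for the 31 rows of Table 7.5 with a meat weight.
STROKES_MEAT = [
    (535, 8.60), (877, 7.10), (525, 2.80), (577, 3.40), (582, 2.50),
    (594, 3.80), (202, 0.50), (2155, 25.7), (420, 4.5), (1757, 17.5),
    (650, 2.9), (687, 17.4), (715, 4.1), (395, 6.1), (371, 4.8),
    (362, 1.9), (739, 3.1), (1124, 4.4), (586, 2.1), (5397, 8.4),
    (2265, 5.5), (2080, 2.4), (986, 3.0), (532, 6.3), (951, 2.3),
    (148, 1.1), (178, 0.7), (596, 5.2), (695, 2.4), (502, 5.8), (877, 2.3),
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


strokes = [s for s, _ in STROKES_MEAT]
meat = [m for _, m in STROKES_MEAT]

r_raw = pearson_r(strokes, meat)
# ASSUMPTION: log10 of both variables; all values are positive, so no offset is needed.
r_log = pearson_r([math.log10(s) for s in strokes], [math.log10(m) for m in meat])

print(f"raw r = {r_raw:.3f}, log-log r = {r_log:.3f}")
```

The same approach does not carry over unchanged to cut-mark counts, several of which are zero and therefore have no logarithm; how such values were handled in the published analysis is not stated here.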

Table 7.6. Number of cut marks generated and amount of meat
removed from eighteen mammal hindlimbs (femur + tibia) by
butchering. Data from Pobiner and Braun (2005)

Limb  Taxon           Meat removed  N of cut marks
1     cow (juvenile)  14.25         15
2     cow (juvenile)  12.00         14
3     cow (juvenile)  4.25          56
4     cow (juvenile)  3.25          15
5     cow (juvenile)  1.00          20
6     cow (juvenile)  0.50          69
7     goat            0.390         31
8     goat            0.538         14
9     goat            0.814         12
10    goat            0.786         8
11    goat            1.084         28
12    goat            1.010         53
13    zebra           23.0          73
14    zebra           23.5          67
15    zebra           13.5          90
16    zebra           14.0          160
17    zebra           23.0          95
18    zebra           23.0          45

Note, the amount of meat removed varies intrataxonomically because
of prebutchery meat removal aimed at testing the hypothesis that the
amount of meat remaining for removal would correlate with the
number of cut marks.

Pobiner and Braun (2005) conclude that there is no relationship between the
number of cut marks and amount of meat removed within each size class of butchered

