Package 'LOST'

Title: Missing Morphometric Data Simulation and Estimation
Description: Functions for simulating missing morphometric data randomly, with taxonomic bias and with anatomical bias. LOST also includes functions for estimating linear and geometric morphometric data.
Authors: J. Arbour, C. Brown
Maintainer: J. Arbour <[email protected]>
License: GPL (>= 2)
Version: 2.1.1
Built: 2024-11-02 03:12:02 UTC
Source: https://github.com/cran/LOST

Help Index


Missing morphometric data simulation and estimation

Description

LOST includes functions for simulating missing morphometric data randomly, with taxonomic bias and with anatomical bias as described by Brown et al. 2012. This package also includes functions for estimating missing morphometric data based on regression analysis and a function for checking the percentage of missing data in a matrix.

Author(s)

J. Arbour and C. Brown

Maintainer: [email protected]

References

Arbour, J. and Brown, C. 2014. Incomplete specimens in Geometric Morphometric Analyses. Methods in Ecology and Evolution 5(1):16-26.

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.


Procrustes superimposition of landmark datasets with some missing values

Description

This function carries out a generalized Procrustes superimposition on all fully complete specimens and produces a consensus configuration (using procGPA from package "shapes"). Each incomplete specimen is then individually rotated and aligned with the consensus configuration based on whichever landmarks are available (using procOPA from package "shapes"). The superimposed data are returned.

Usage

align.missing(X)

Arguments

X

An l X 2 (or 3) X n array of coordinate data, where n is the number of specimens and l is the number of landmarks.

Value

Returns an l X 2 (or 3) X n array of superimposed coordinate data.

Author(s)

J. Arbour

References

Arbour, J. and Brown, C. 2014. Incomplete specimens in Geometric Morphometric Analyses. Methods in Ecology and Evolution 5(1):16-26.

See Also

MissingGeoMorph

Examples

data(dacrya)

## make some specimens incomplete
dac.miss<-missing.data(dacrya,remsp=0.2,land.vec=c(1,2,3,4,5,6))

## align all specimens
dac.aligned<-align.missing(dac.miss)

Estimate missing morphometric data with a highly correlated variable

Description

Estimates missing morphometric data using regression on the most highly correlated morphological variable available.

Usage

best.reg(x)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, containing some percentage of missing values entered as NA.

Value

Returns an n X m matrix containing the original morphometric values together with estimates for all previously missing values.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

est.reg
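
Examples

The following is a minimal base-R sketch of the idea described above, not the package's implementation. For each variable with missing values it tries the other variables in order of decreasing absolute correlation and predicts each missing entry from a simple linear regression on the first predictor that is observed for that specimen. The helper name best_reg_sketch and the toy data are hypothetical.

```r
# Hypothetical helper: regression on the best-correlated available variable.
best_reg_sketch <- function(x) {
  x <- as.matrix(x)
  est <- x
  # correlations from whatever pairs of values are jointly observed
  r <- abs(cor(x, use = "pairwise.complete.obs"))
  diag(r) <- NA
  for (j in seq_len(ncol(x))) {
    miss <- which(is.na(x[, j]))
    if (length(miss) == 0) next
    # try predictors from most to least correlated with variable j
    for (k in order(r[, j], decreasing = TRUE, na.last = NA)) {
      todo <- miss[is.na(est[miss, j]) & !is.na(x[miss, k])]
      if (length(todo) == 0) next
      fit <- lm(x[, j] ~ x[, k])   # incomplete rows are dropped by lm()
      est[todo, j] <- coef(fit)[1] + coef(fit)[2] * x[todo, k]
    }
  }
  est
}

set.seed(1)
size <- runif(20, 10, 50)
toy <- cbind(a = size,
             b = 2 * size + rnorm(20, sd = 0.5),
             c = 0.5 * size + rnorm(20, sd = 0.5))
toy[3, "b"] <- NA
toy[7, "c"] <- NA
filled <- best_reg_sketch(toy)
```

With the package loaded, the corresponding call would be best.reg(toy).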


Align a bilaterally symmetric landmark configuration with a plane

Description

Aligns a bilaterally symmetric landmark dataset to a specific plane by minimizing the sum of squared distances of one coordinate (x, y or z). Useful for averaging bilateral landmarks or in preparation for correcting for artifacts like bending.

Usage

bilat.align(coords, land.pairs, average = TRUE, restricted = NULL)

Arguments

coords

Either a matrix or array of landmark data with columns representing the x, y, z coordinates and rows representing landmarks. See details for how this is applied for a single vs. multiple specimens.

land.pairs

A 2-column matrix indicating bilaterally paired landmarks. All "left" landmarks should be in the same column (and likewise for "right" landmarks).

average

An optional term indicating that bilaterally paired landmarks should be mirrored and averaged, leaving only one "side" and the midline landmarks.

restricted

A set of row numbers indicating which landmarks should be considered by "optim" when selecting the optimal rotation. Typically these are landmarks representing a rigid structure, for cases where other landmarks represent articulated/moveable features.

Details

If a matrix of a single specimen's landmarks is provided, it is aligned to a plane. If an array of multiple specimens is provided, the specimens should already be aligned with Procrustes superimposition, and the entire configuration is optimized with a single rotation applied to all specimens. The sum of squares (SS) is minimized along the third axis (coords[,3] or coords[,3,]).

Value

A matrix or array giving the rotated landmark configuration

Author(s)

J.H. Arbour

References

Arbour, J.H. In Prep. Get Unbent! R Tools for the removal of arching and bending of fish specimens in geometric morphometric shape analysis

See Also

unbend.spine, unbend.tps.poly

Examples

library(rgl)
data(darters)
## align darter configuration by head landmarks (restricted)
aligned<-bilat.align(darters$coords[,,1],
darters$land.pairs,average=FALSE,restricted=darters$restricted)

plot3d(aligned, aspect=FALSE)

Simulate missing morphometric data with taxonomic bias

Description

This function simulates a higher frequency of missing data points in groups that are numerically less well represented in the whole sample, relative to other groups. These groups may represent taxa (as used in Brown et al., 2012), but may also represent any other grouping of interest (e.g., populations, trials, subsamples). From a morphometric dataset, this function first selects a number of specimens to have data points removed from at random. A vector containing the number of measurements to remove from each specimen is sorted into descending order. Specimens are then sampled without replacement, each with a probability proportional to the total sample size divided by the number of specimens in its respective group. The order in which the specimens are sampled determines the number of data points to be removed (i.e., the first to be sampled has the most removed). A complete mathematical description may be found in Brown et al. (2012).

Usage

byclade(x, remperc, groups)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, or an l X 2 (or 3) X n array of geometric morphometric coordinates (2D or 3D), where l is the number of landmarks.

remperc

The percentage of data to be removed from the matrix, expressed as a decimal (ex: 30 percent would be entered as 0.3)

groups

A vector of length n specifying taxonomic group membership as integers (ex: c(1,1,2,2,3,3,...) )

Value

Returns a matrix or array (depending on input) of morphometric data with missing values entered as NA.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

missing.data, obliterator
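
Examples

The weighting described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the group vector and the removal counts are hypothetical, and the sampling weight is taken to be the total sample size divided by the size of each specimen's group.

```r
set.seed(42)
groups <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3)        # hypothetical group membership

# size of the group each specimen belongs to
group.n <- table(groups)[as.character(groups)]
# weight = total N / group n, so rare groups are sampled sooner
w <- length(groups) / as.numeric(group.n)
prob <- w / sum(w)

# hypothetical numbers of measurements to remove, sorted in descending order
n.remove <- sort(c(1, 4, 2, 3), decreasing = TRUE)

# specimens drawn earlier receive the larger removal counts
picked <- sample(seq_along(groups), size = length(n.remove), prob = prob)
data.frame(specimen = picked, group = groups[picked], removed = n.remove)
```

In this toy case the single specimen of group 3 has six times the sampling weight of any specimen in group 1.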


Remove incomplete specimens from a landmark dataset

Description

This function takes a dataset containing both complete and incomplete specimens and removes all incomplete specimens.

Usage

complete.specimens(dataset, nlandmarks)

Arguments

dataset

An n*l X 2 matrix of coordinate data, where n is the number of specimens and l is the number of landmarks. All landmarks from one specimen should be grouped together.

nlandmarks

The number of landmarks per specimen

Value

Returns a c*l X 2 matrix of landmark data, where c is the number of complete specimens and l is the number of landmarks.

Author(s)

J. Arbour

References

Arbour, J. and Brown, C. 2014. Incomplete specimens in Geometric Morphometric Analyses. Methods in Ecology and Evolution 5(1):16-26.

See Also

align.missing, MissingGeoMorph
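
Examples

A minimal base-R sketch of the filtering step for a stacked n*l X 2 matrix, using a hypothetical toy dataset. It assumes a specimen counts as incomplete when any of its landmark rows contains NA; it is not the package's implementation.

```r
nlandmarks <- 3
# hypothetical stacked dataset: 4 specimens x 3 landmarks, 2D
dataset <- matrix(as.numeric(1:24), ncol = 2)
dataset[5, ] <- NA   # specimen 2 loses its second landmark

# specimen index for every row of the stacked matrix
specimen <- rep(seq_len(nrow(dataset) / nlandmarks), each = nlandmarks)

# a specimen is kept only if none of its rows contains NA
row.ok <- complete.cases(dataset)
keep <- as.vector(tapply(row.ok, specimen, all))
complete <- dataset[keep[specimen], , drop = FALSE]
dim(complete)   # 9 rows (3 complete specimens x 3 landmarks), 2 columns
```

With the package loaded, the corresponding call would be complete.specimens(dataset, nlandmarks).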


Crocodile morphometrics

Description

A linear morphometric dataset featuring 23 cranial measurements from 223 specimens representing 21 crocodilian species.

Usage

data(crocs)

Format

An n X m data frame, where n is the number of specimens and m is the number of variables.

Source

http://datadryad.org/resource/doi:10.5061/dryad.m01st7p0

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

obliterator, byclade, missing.data, crocs.landmarks


Coordinate data for a crocodilian reference skull

Description

Landmark data for the measurement points on a reference crocodilian skull, for use with the obliterator function.

Usage

data(crocs.landmarks)

Format

A 6 X m dataframe in which each column gives the start and end points for each cranial measurement in the crocs dataset, from a single reference specimen. 3D Coordinates are listed as x1, x2, y1, y2, z1, z2 in each column.

Source

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

obliterator, byclade, missing.data, crocs


Landmark data from Guianacara dacrya

Description

Sixteen landmarks taken from the lateral profile of 73 specimens from the Essequibo and rio Branco drainages, used in the description of Guianacara dacrya

Usage

data(dacrya)

Format

A 16 X 2 X 73 array of geometric morphometric coordinates

Source

Arbour, J. and Lopez-Fernandez, H. 2011. Guianacara dacrya, a new species from the rio Branco and Essequibo River drainages of the Guiana Shield (Perciformes: Cichlidae). Neotropical Ichthyology 9:87-96.

See Also

align.missing, MissingGeoMorph


Darter landmarks

Description

A 3D landmark dataset from 30 species of darter fishes (Etheostomatinae; Percidae)

Usage

data("darters")

Format

The format is:
List of 6
 $ coords    : num [1:220, 1:3, 1:30] -1.458 -0.489 -0.037 1.705 0.959 ...
  ..- attr(*, "dimnames")=List of 3
  .. ..$ : NULL
  .. ..$ : NULL
  .. ..$ : chr [1:30] "Etheostoma_caeruleum_mtsu5_58mmsl.stl" "Ammocrypta_beanii_ummz242736_43mm.stl" "Ammocrypta_clara_ummz148570_42.23mm.stl" "Crystallaria_asprella_Ummz211889_60mmSL.stl" ...
 $ land.pairs:'data.frame': 101 obs. of 2 variables:
  ..$ left : int [1:101] 1 3 5 7 9 11 13 15 17 19 ...
  ..$ right: int [1:101] 2 4 6 8 10 12 14 16 18 20 ...
 $ sliders   :'data.frame': 32 obs. of 3 variables:
  ..$ start: int [1:32] 22 23 24 25 26 27 28 29 31 32 ...
  ..$ slide: int [1:32] 23 24 25 26 27 28 29 30 32 33 ...
  ..$ end  : int [1:32] 24 25 26 27 28 29 30 31 33 34 ...
 $ surface   :'data.frame': 144 obs. of 1 variable:
  ..$ surface: int [1:144] 60 61 62 63 64 65 66 68 69 70 ...
 $ restricted: int [1:58] 1 2 3 4 5 6 7 8 9 10 ...
 $ reference : num [1:11] 22 99 180 15 16 63 176 81 178 11 ...

Details

Includes landmark coordinates (coords), a matrix indicating bilaterally paired landmarks (land.pairs), curve sliders (sliders), surface sliders (surface), rows of head landmarks (restricted) and landmarks approximating the spine/long axis (reference).

Source

Arbour, J.H. In Prep. Get Unbent! R Tools for the removal of arching and bending of fish specimens in geometric morphometric shape analysis

References

Arbour, J.H. In Prep. Get Unbent! R Tools for the removal of arching and bending of fish specimens in geometric morphometric shape analysis

See Also

unbend.spine, bilat.align, unbend.tps.poly

Examples

data(darters)
library(rgl)
plot3d(darters$coords[,,1], aspect=FALSE)

A-priori size regression for missing data estimation

Description

Estimates missing data using regression on a designated size variable. Any missing values of the size variable itself are estimated from the variable most strongly correlated with size.

Usage

est.reg(x, col_indep)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, containing some percentage of missing values entered as NA.

col_indep

The number of the column in which the independent size variable is stored. This column will be used to estimate missing values in the other columns.

Value

Returns an n X m matrix containing the original morphometric values together with estimates for all previously missing values.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

best.reg
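
Examples

A minimal base-R sketch of regression on an a-priori size variable, using hypothetical toy data. It covers only the case where the size column is complete; the step in which missing size values are themselves estimated is omitted. This is an illustration, not the package's implementation.

```r
set.seed(2)
size <- runif(15, 20, 80)
x <- cbind(length = size,
           width  = 0.4 * size + rnorm(15, sd = 0.3),
           depth  = 0.2 * size + rnorm(15, sd = 0.3))
x[c(2, 9), "width"] <- NA
x[5, "depth"] <- NA
col_indep <- 1   # column holding the size variable (complete in this toy case)

est <- x
for (j in setdiff(seq_len(ncol(x)), col_indep)) {
  miss <- is.na(x[, j])
  if (!any(miss)) next
  d <- data.frame(y = x[, j], size = x[, col_indep])
  fit <- lm(y ~ size, data = d)   # fitted on specimens with both values present
  est[miss, j] <- predict(fit, newdata = d[miss, , drop = FALSE])
}
```

With the package loaded, the corresponding call would be est.reg(x, col_indep = 1).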


Reflected Relabelling

Description

This function carries out reflected relabelling to estimate missing geometric morphometric landmarks using bilateral symmetry, following Gunz et al. (2009).

A set of 3D landmarks is mirrored and aligned with the original data (using procOPA from package "shapes"). Missing landmarks are interpolated from the mirrored specimen.

Usage

flipped(specimen, land.pairs, show.plot = FALSE, axis = 1)

Arguments

specimen

An l X 3 matrix of coordinate data, where l is the number of landmarks. Some data should be missing and designated with NA.

land.pairs

A 2-column matrix in which each row contains the row numbers (from the matrix specimen) of a bilateral pair of landmarks. Unpaired landmarks do not need to be included. See also bilateral symmetry analyses in package "geomorph".

show.plot

Optionally plot the specimen using plot3d from rgl. Estimated landmarks are given in red. Defaults to FALSE.

axis

Which axis should be mirrored across. Default is x (1).

Value

Returns an l X 3 matrix of landmarks.

Author(s)

J. Arbour

References

Gunz P., Mitteroecker P., Neubauer S., Weber G., Bookstein F. 2009. Principles for the virtual reconstruction of hominin crania. Journal of Human Evolution 57:48-62.

See Also

MissingGeoMorph
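
Examples

The mirror, relabel, align and fill steps can be sketched in base R as below. This is a simplified illustration and differs from the function described above: instead of procOPA it uses a rigid rotation found by singular value decomposition (no scaling), and the helper name mirror_fill and the toy specimen are hypothetical.

```r
# Hypothetical helper: reflected relabelling with a rigid SVD-based alignment.
mirror_fill <- function(specimen, land.pairs, axis = 1) {
  m <- specimen
  m[, axis] <- -m[, axis]                        # reflect across the chosen axis
  # relabel: swap the left and right members of every pair
  m[c(land.pairs[, 1], land.pairs[, 2]), ] <-
    m[c(land.pairs[, 2], land.pairs[, 1]), ]
  # landmarks present in both the original and the mirrored copy
  ok <- complete.cases(specimen) & complete.cases(m)
  A <- specimen[ok, , drop = FALSE]
  B <- m[ok, , drop = FALSE]
  ca <- colMeans(A)
  cb <- colMeans(B)
  # rotation R minimising the squared distance between centred B %*% R and A
  s <- svd(t(sweep(B, 2, cb)) %*% sweep(A, 2, ca))
  d <- sign(det(s$u %*% t(s$v)))                 # guard against a reflection
  R <- s$u %*% diag(c(1, 1, d)) %*% t(s$v)
  fitted <- sweep(sweep(m, 2, cb) %*% R, 2, ca, "+")
  out <- specimen
  gap <- !complete.cases(specimen)
  out[gap, ] <- fitted[gap, ]                    # fill gaps from the mirrored copy
  out
}

# toy specimen, symmetric about the plane x = 0
truth <- rbind(c(-1, 0, 0), c(1, 0, 0), c(-2, 1, 0.5),
               c(2, 1, 0.5), c(0, 3, 1), c(0, -1, 2))
pairs <- rbind(c(1, 2), c(3, 4))
spec <- truth
spec[4, ] <- NA
est <- mirror_fill(spec, pairs)
```

Because the toy specimen is perfectly symmetric, the estimate for landmark 4 matches the value that was removed. With the package loaded, the corresponding call would be flipped(spec, pairs).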


Calculate the percentage of missing morphometric data

Description

Calculates the percentage of morphometric data points that have been replaced with NA by functions such as missing.data, byclade or obliterator from LOST. Used to verify the amount of missing data introduced into complete morphometric matrices.

Usage

how.many.missing(x)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, or an l X 2 (or 3) X n array of geometric morphometric data, containing some percentage of missing data.

Value

Returns the percentage (as a decimal) of missing data points present in x

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

missing.data
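
Examples

As a rough base-R illustration of the quantity returned, assuming the proportion is taken over all entries of x, the share of NA values can be computed directly. The toy matrix is hypothetical.

```r
x <- matrix(as.numeric(1:20), nrow = 5)
x[c(2, 7, 13, 20)] <- NA

# proportion of entries that are NA: 4 of 20 values, i.e. 0.2
prop.missing <- mean(is.na(x))
prop.missing
```

With the package loaded, the corresponding call would be how.many.missing(x).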


Randomly input missing data points

Description

Randomly replaces a set percentage of data points in a matrix of morphometric measurements with NA to simulate missing data. This is the function RMD from Brown et al. (2012). For simple morphometric data, the amount of missing data is chosen as an overall percentage of data points (remperc). For landmark arrays, it can instead be chosen as a percentage of specimens (remsp) with a number of landmarks removed per specimen (land.vec), and removal can be constrained to a set of landmarks (land.identity).

Usage

missing.data(x, remperc, remsp = NULL, land.vec = NULL, land.identity = NULL)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, or an array of geometric morphometric landmarks (l X m X n).

remperc

The percentage of data to be removed from the matrix or array, expressed as a decimal (ex: 30 percent would be entered as 0.3)

remsp

The percentage of specimens in the array from which landmarks are to be removed, expressed as a decimal (ex: 30 percent would be entered as 0.3)

land.vec

The number of landmarks to remove per specimen in an array. This can be a single value or vector with unique or repeating values.

land.identity

A vector to constrain the landmarks to choose from when assigning missing data. The values correspond to row numbers in an array.

Value

Returns an n X m matrix or l X m X n array of morphometric data with missing values entered as NA.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

byclade,obliterator

Examples

data(dacrya)

#### remove 1 to 6 landmarks from 20% of specimens
dac.miss<-missing.data(dacrya,remsp=0.2,land.vec=c(1,2,3,4,5,6))
dac.miss

Simulate incomplete specimens

Description

Randomly selects a pre-determined number of specimens from a landmark dataset (2D or 3D) and removes some of their landmarks.

Usage

missing.specimens(dataset, nspremove, nldremove, nlandmarks)

Arguments

dataset

An n*l X 2 (or 3) matrix of coordinate data, where n is the number of specimens and l is the number of landmarks. All landmarks from one specimen should be grouped together.

nspremove

The number of specimens which should have landmarks removed.

nldremove

The number of landmarks to remove per specimen. This may be a single value or a vector of values, none of which can exceed nlandmarks. If a vector is given, for each specimen selected, the function will randomly select a value from the vector and remove that many landmarks.

nlandmarks

The number of landmarks per specimen

Value

Returns an n * l X 2 (or 3) matrix with some complete and some incomplete specimens.

Author(s)

J. Arbour

References

Arbour, J. and Brown, C. 2014. Incomplete specimens in Geometric Morphometric Analyses. Methods in Ecology and Evolution 5(1):16-26.

See Also

align.missing, MissingGeoMorph
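
Examples

A minimal base-R sketch of the procedure described above, using a hypothetical stacked dataset. It draws the specimens to degrade, then for each one draws how many landmarks to remove and which ones. It is an illustration, not the package's implementation.

```r
set.seed(3)
nlandmarks <- 4
dataset <- matrix(rnorm(5 * nlandmarks * 2), ncol = 2)   # 5 hypothetical 2D specimens
nspremove <- 2
nldremove <- c(1, 2)

nspecimens <- nrow(dataset) / nlandmarks
out <- dataset
for (s in sample(nspecimens, nspremove)) {
  # draw how many landmarks this specimen loses, then which ones
  k <- nldremove[sample.int(length(nldremove), 1)]
  lost <- sample(nlandmarks, k)
  # rows of specimen s start at (s - 1) * nlandmarks + 1
  out[(s - 1) * nlandmarks + lost, ] <- NA
}
```

With the package loaded, the corresponding call would be missing.specimens(dataset, nspremove, nldremove, nlandmarks).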


Estimate missing landmark data

Description

This function provides several options for estimating landmark data (details of which can be found in the references below). The function first aligns the landmarks using Procrustes superimposition (align.missing). Both 2D and 3D coordinates can be accommodated.

Usage

MissingGeoMorph(x, method = "BPCA", original.scale = FALSE)

Arguments

x

An n*l X 2 matrix (2D data only) or an l X m X n array (2D or 3D data) of coordinate data, where n is the number of specimens, l is the number of landmarks and m is the number of dimensions. In the matrix format, all landmarks from one specimen should be grouped together. Missing values should be given as NA.

method

Four methods are provided for estimating missing landmark data: 1) "BPCA" - Bayesian principal component analysis, 2) "mean" - mean substitution, 3) "reg" - values are estimated based on the most strongly correlated variable available, and 4) "TPS" - thin plate spline interpolation (only available for 2D). See Arbour and Brown (2014) for a comparison of the performance of each of these methods.

original.scale

Rescale and translate the data back to its original size (TRUE) or leave it in the rescaled, superimposed configuration (FALSE)

Value

Returns an n * l X 2 (or 3) matrix of coordinate data, with missing values imputed. Landmarks have been aligned and are given in the original shape space.

Author(s)

J. Arbour

References

Arbour, J. and Brown, C. 2014. Incomplete specimens in Geometric Morphometric Analyses. Methods in Ecology and Evolution 5(1):16-26.

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

align.missing, missing.specimens
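
Examples

Since x may be given either as a stacked n*l X 2 matrix or as an l X m X n array, the following base-R sketch shows one way to convert between the two layouts. The toy data are hypothetical, and the conversion assumes that the landmarks of each specimen occupy consecutive rows of the stacked matrix.

```r
nlandmarks <- 3
# hypothetical stacked matrix: 2 specimens x 3 landmarks, 2D
stacked <- rbind(cbind(c(0, 1, 2), c(0, 1, 0)),
                 cbind(c(5, 6, NA), c(5, 6, NA)))
n <- nrow(stacked) / nlandmarks

# l X m X n array: one l X m slice per specimen
arr <- aperm(array(t(stacked), dim = c(ncol(stacked), nlandmarks, n)),
             c(2, 1, 3))

# and back to the stacked n*l X m layout
back <- do.call(rbind, lapply(seq_len(n), function(i) arr[, , i]))
```

Either stacked or arr could then be passed as x to MissingGeoMorph, given a dataset with enough specimens for the chosen method.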


Simulate missing morphometric data with anatomical bias

Description

This function simulates the effect of proximity between measurements in morphometric data on the distribution of missing values. This attempts to replicate specimens showing regional incompleteness. From a morphometric dataset, this function selects a number of specimens to have data points removed from and a number of measurements to remove from each of these specimens based on a random distribution of missing data. For each specimen, this function randomly selects one starting data point for removal. All subsequent data points have a probability of removal that is proportional to the inverse of the distance to all previously removed data points, based on a reference set of landmarks (matrix 'distances'). For a complete mathematical description see Brown et al. (2012). See function obliteratorGM for the geometric morphometric implementation.

Usage

obliterator(x, remperc, landmarks, expo=1)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables

remperc

The percentage of data to be removed from the matrix, expressed as a decimal (ex: 30 percent would be entered as 0.3)

landmarks

A 6 X m matrix that includes the start and end points (landmarks) for each morphometric measurement from a reference specimen (3D). The data in each column is ordered as x1,x2,y1,y2,z1,z2. See example crocs.landmarks

expo

An optional term for raising the denominator to an exponent, to increase or decrease the severity of the anatomical bias

Value

Returns an n X m matrix of morphometric data with missing values entered as NA.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

missing.data, byclade, obliteratorGM
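
Examples

The proximity-weighted removal described above can be sketched for a single specimen in base R. Several details here are assumptions made for illustration and may differ from the package: each measurement is located at the midpoint of its two end points, the weight of a remaining measurement is the sum of its inverse distances (raised to expo) to all measurements already removed, and the reference matrix is random toy data.

```r
set.seed(4)
# hypothetical 6 X m reference matrix, rows ordered x1, x2, y1, y2, z1, z2
landmarks <- matrix(runif(6 * 8, 0, 10), nrow = 6)
expo <- 1
n.remove <- 3

# assumption: each measurement sits at the midpoint of its two end points
mid <- cbind(x = colMeans(landmarks[1:2, ]),
             y = colMeans(landmarks[3:4, ]),
             z = colMeans(landmarks[5:6, ]))
d <- as.matrix(dist(mid))

removed <- sample(ncol(landmarks), 1)   # random starting measurement
while (length(removed) < n.remove) {
  left <- setdiff(seq_len(ncol(landmarks)), removed)
  # summed inverse distance (raised to expo) to everything already removed
  w <- rowSums(1 / d[left, removed, drop = FALSE]^expo)
  removed <- c(removed, left[sample.int(length(left), 1, prob = w)])
}
removed
```

Larger values of expo concentrate the removals more tightly around the starting measurement.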


Simulate missing geometric morphometric landmarks with anatomical bias

Description

This is the geometric morphometric implementation of the LOST function obliterator. This attempts to replicate specimens showing regional incompleteness. For each specimen, this function randomly selects one starting data point for removal. All subsequent data points have a probability of removal that is proportional to the inverse of the distance to all previously removed data points, based on the shape of that particular specimen (this differs from the linear morphometric implementation which requires a reference set of coordinates). For a complete mathematical description see Brown et al. (2012).

Usage

obliteratorGM(x, remperc, expo=1)

Arguments

x

An n X m matrix of morphometric data with n specimens and m variables, or an l X 2 (or 3) X n array of geometric morphometric coordinates, with l being the number of landmarks.

remperc

The percentage of data to be removed from the matrix, expressed as a decimal (ex: 30 percent would be entered as 0.3)

expo

An optional term for raising the denominator to an exponent, to increase or decrease the severity of the anatomical bias

Value

Returns an n X m matrix of morphometric data with missing values entered as NA.

Author(s)

J. Arbour and C. Brown

References

Brown, C., Arbour, J. and Jackson, D. 2012. Testing of the Effect of Missing Data Estimation and Distribution in Morphometric Multivariate Data Analyses. Systematic Biology 61(6):941-954.

See Also

missing.data, byclade, obliterator


Correct for lateral bending in fish geometric morphometric landmarks

Description

Corrects for the impact of lateral bending along the spine of a fish in geometric morphometric landmarks. The function fits a polynomial function along the length and width of the specimen and determines the perpendicular residuals and the arc length along the polynomial; these are then used as the new length and width landmarks. Landmarks are first centered and bilaterally aligned using bilat.align.

Usage

unbend.spine(coords, land.pairs, deg = 3, restricted = NULL)

Arguments

coords

A matrix of landmark coordinate data. Columns should be coordinates, and rows landmarks.

land.pairs

A 2-column matrix giving the bilaterally paired landmarks. One column should be all "left" landmarks and one all "right" landmarks.

deg

The degree of the polynomial function, passed to the function "poly". Typically 2 or 3.

restricted

A limited set of landmarks (row numbers for the coords matrix) to use for bilateral alignment. Typically those representing a rigid/fixed structure (e.g., head). Passed to bilat.align.

Details

Resulting landmark data is in the same scale as the original landmark configuration. Can be applied over multiple specimens using for-loops or apply functions.

Value

bilat.aligned

Provides the bilaterally aligned landmark data as a matrix

unbent

Provides the bilaterally aligned and unbent landmark data as a matrix

Author(s)

J.H. Arbour

References

Arbour, J.H. In Prep. Get Unbent! R Tools for the removal of arching and bending of fish specimens in geometric morphometric shape analysis

See Also

bilat.align, unbend.tps.poly

Examples

data(darters)
library(rgl)
## bilaterally aligned using only head landmarks
lands.unbent<-unbend.spine(darters$coords[,,2],
darters$land.pairs,deg=3, restricted=darters$restricted)$unbent

plot3d(lands.unbent, aspect=FALSE)

TPS-style unbend specimens

Description

Removes the dorsoventral arching effect from fish specimen landmark data. The function is similar to the "unbend specimens" utility in the TPS software suite. It fits a polynomial function along the length and height of the specimen and determines the perpendicular residuals and the arc length along the polynomial; these are then used as the new length and width landmarks.

Usage

unbend.tps.poly(coords, reference, axes = NULL, deg = 3)

Arguments

coords

A matrix of landmark coordinate data. Columns should be coordinates, and rows landmarks.

reference

The rows of the matrix over which the polynomial function will be fit. Should represent the spine or other proxy for the long axis of the body.

axes

A vector with 2 values representing the "lateral" view of the fish. The first entry should be the "long" (anterior-posterior) axis and the second should be the vertical (dorso-ventral) axis.

deg

The degree of the polynomial function, passed to "poly". Typically 2 or 3 (default = 3).

Details

It is advisable to remove lateral bending with unbend.spine prior to using this function. Otherwise, data should at least be bilaterally aligned to a plane (see bilat.align). Resulting landmark data is in the same scale as the original landmark configuration. Can be applied over multiple specimens using for-loops or apply functions.

Value

Returns a matrix of landmark data with the effect of dorso-ventral arching removed.

Author(s)

J.H. Arbour

References

Arbour, J.H. In Prep. Get Unbent! R Tools for the removal of arching and bending of fish specimens in geometric morphometric shape analysis

See Also

bilat.align, unbend.spine

Examples

library(rgl)
data(darters)
## bilaterally aligned using only head landmarks
lands.unbent<-unbend.spine(darters$coords[,,3],
darters$land.pairs,deg=3, restricted=darters$restricted)$unbent

plot(lands.unbent[,c(1,3)],asp=1)

lands.unbent<-unbend.tps.poly(lands.unbent,darters$reference,axes=c(1,3))
plot(lands.unbent[,c(1,2)],asp=1)

plot3d(lands.unbent, aspect=FALSE)