**Discovering Craniofacial Patterns Using Multivariate Cephalometric Data for Treatment Decision Making in Orthodontics**

**Descubriendo Patrones Craneofaciales Usando Datos Cefalométricos Multivariados para la Toma de Decisiones en Ortodoncia**

**Pamela Araya-Díaz\*; Gonzalo A. Ruz\*\* & Hernán M. Palomino\*,\*\*\***

* Departamento del Niño y Adolescente, Área de Ortodoncia, Facultad de Odontología, Universidad Andrés Bello, Santiago, Chile.

** Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Santiago, Chile.

*** Departamento del Niño y Ortopedia Dentomaxilar, Facultad de Odontología, Universidad de Chile, Santiago, Chile.

**SUMMARY**: The aim was to find craniofacial morphology patterns in a multivariate cephalometric database using a clustering technique. Cephalometric analysis was performed on a sample of 100 teleradiographs collected from Chilean orthodontic patients. Thirty cephalometric measurements were taken from commonly used analyses. The computed variables were used to perform a clustering analysis with the k-means algorithm to identify patterns of craniofacial morphology. The J48 decision tree was used to analyze each cluster, and the ANOVA test to determine the statistical differences between the clusters. Four clusters were found that had significant differences (P<0.001) in 24 of the 30 variables studied, suggesting that they represent different patterns of craniofacial form. Using the decision tree, 8 of the 30 variables appeared to be relevant for describing the clusters. The clustering analysis is effective in identifying different craniofacial patterns based on a multivariate database. The distinct clusters appear to be caused by differences in the compensation process of the facial structure responding to a genetically determined cranial and mandible form. The proposed method can be applied to several databases, creating specific classifications for each one of them.

**KEY WORDS: Craniofacial patterns; Morphological patterns; Clustering technique; Orthodontics.**

**RESUMEN**: El objetivo fue encontrar patrones morfológicos craneofaciales, a partir de una base de datos cefalométricos multivariada, utilizando una técnica de clustering. Se realizó un análisis cefalométrico a una muestra de 100 telerradiografías pertenecientes a pacientes chilenos de ortodoncia. Treinta medidas cefalométricas obtenidas de los análisis más utilizados fueron registradas. Las variables computadas se utilizaron para realizar un análisis de clustering con el algoritmo k-medias, para identificar patrones de morfología craneofacial. El árbol de decisión J48 se utilizó para analizar cada cluster, y test de ANOVA para determinar diferencias estadísticamente significativas entre los clusters. Se encontraron cuatro clusters con diferencia estadísticamente significativas (p<0,001) en 24 de las 30 variables estudiadas, lo que sugiere que efectivamente corresponden a diferentes patrones craneofaciales. Utilizando el árbol de decisión, se pudo determinar que 8 de las 30 variables resultaron ser relevantes en la definición de los clusters. El análisis de clustering es efectivo en identificar patrones morfológicos craneofaciales usando una base de datos multivariada. Los distintos cluster encontrados, aparentemente se formarían a partir de diferencias en el proceso de compensación de la estructura facial, en respuesta a la forma mandibular genéticamente determinada. El método propuesto puede ser aplicado a múltiples bases de datos, creando clasificaciones específicas para cada una de ellas.

**PALABRAS CLAVE: Patrones craneofaciales; Patrones morfológicos; Técnica de clustering; Ortodoncia.**

**INTRODUCTION**

A child's face is not a miniature version of an adult's face; in fact, it does not have the same proportions between the different parts and regions that constitute a face. The progressive facial growth is a differential growth process, meaning that some parts or regions grow in size earlier or later than others, and they do it in different ranges, directions, and magnitudes. Therefore, the growing process of the face involves a sequence of maturing steps, which finally establishes the dimensions, angles, and proportions of an adult face (McCarthy, 1990).

There is a "normal" pattern of craniofacial growth, in which a general downward and forward direction is maintained. However, the craniofacial components can develop at different times and rates, and with one direction or another predominating (McCarthy).

This leads to variations in the morphology of the maxillofacial complex, which can be classified into distinct facial types, patterns, or biotypes (Chvatal et al., 2005). These differences are observed in the way the face is related to the cranium, in the general contour of the profile, and in the relationships of the teeth to the skeletal components of the face (Downs, 1949). Facial patterns have been extensively studied (Ricketts, 1957; Koch & Bartsch, 1996; Riesmeijer et al., 2004; Ilayperuma, 2011; Bedoya et al., 2012) in order to facilitate diagnosis and the design of better treatment plans in the medical disciplines concerned with facial form and its improvement, such as plastic surgery, maxillofacial surgery, and orthodontics (Chvatal et al.).

In orthodontics, facial patterns are relevant because the variations of each individual should be considered and each case treated according to its own requirements (Downs). In this context, many studies have favoured a more individualized treatment approach based on the individual skeletal pattern (Kim et al., 2005; Bedoya et al.).

Therefore, the identification of facial patterns could be used as a basis for developing growth predictors. This is especially important in preadolescent and adolescent patients, who are experiencing significant growth-related changes in their occlusion, facial skeleton, and profile (Bishara, 2000). The direction and magnitude of growth can act positively in the correction of a malocclusion, helping the mechanics of the treatment; if these changes are not considered, there is a risk that the treatment will have no effect or, even worse, will aggravate the patient's anomaly (Gregoret, 2002). Also, it is not possible to know where to position the teeth unless it is known where the bony bases will be during the growth process and at the end of the treatment. Growth prediction is important not only in treatment planning and delivery, but also in the evaluation of prognosis during retention and post-retention (Kocadereli & Telli, 1999).

The most widely used and versatile technique for investigating the craniofacial skeleton is cephalometry, the name given to the morphological study of all the structures present in the human head (Vojdani et al., 2009). This technique consists of identifying anatomical reference points (landmarks) on a lateral cranial radiograph (lateral teleradiograph) and measuring the lines, angles, and ratios obtained from the traces connecting those landmarks (McIntyre & Mossey, 2003).

Based on cephalometric analysis, several pattern classifications have been described (Downs; Ricketts; Sassouni, 1969; Schudy, 1964), but most of them use a limited number of variables, choose the variables arbitrarily, or analyze the sagittal and vertical dimensions separately.

In this respect, a skeletal pattern classification that combines sagittal and vertical morphological data across a large number of variables can be useful in clinical practice for better diagnosis and treatment planning.

The recognition of patterns, inherent in a population, can be carried out by using unsupervised machine learning techniques. These techniques are capable of grouping individuals or cases from a data set, based on the similarities or differences between their attributes, recognizing subgroups, classes or patterns. This process is also known as clustering; the two terms are often used interchangeably (Härdle & Simar, 2003).

The objective of this work is, given a cephalometric data set, to present a new way of deriving a facial biotype classification using unsupervised machine learning, to determine which variables are relevant for classifying the data, and to explore the characteristics of the patterns found.

In a second stage, the patterns found could be used as class labels for prediction, thus supporting treatment-plan decisions in growing patients.

**MATERIAL AND METHOD**

**Cephalometric Data**. A sample of 100 lateral teleradiographs was collected from Chilean patients: 47 males (mean age: 14.6 ± 1.87 years) and 53 females (mean age: 14.3 ± 2.42 years). These patients were randomly selected from the files of two orthodontic clinics in Santiago, Chile. Patients who had been treated with orthopedic therapies such as protraction facemask, functional appliances, or extraoral forces were excluded from the sample because those appliances could modify the craniofacial morphology. Reference points were then marked (Figure 1), and a calibrated examiner drew the cephalometric traces manually. The measurements (30 variables) were taken from commonly used cephalometric analyses (Table I) (Zamora & Duarte, 2003).

In order to evaluate the intraexaminer error, cephalograms of 10 patients were re-measured after 15 days, and the intraclass correlation coefficient (ICC) was calculated.

**Clustering**. The computed attributes were used to perform a clustering analysis, with the k-means algorithm, in order to identify the different patterns with the data mining software Weka (Witten & Frank, 2005).

The clustering algorithm used was k-means (MacQueen, 1967), which is a method for finding the natural groupings of a data set. The k-means algorithm is shown in Figure 2, where dist is the squared Euclidean distance.
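Since Figure 2 is not reproduced here, the following is a minimal sketch of a standard k-means iteration with the squared Euclidean distance. It is an illustration under stated assumptions, not the Weka implementation used in the study: the function names, the random choice of initial centroids, and the toy data are all hypothetical.

```python
import random


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Standard k-means iteration; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data (not study data): two well-separated groups in two dimensions.
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 12.0]]
centroids, labels = kmeans(toy, k=2)
```

In the study each point would be a patient described by the 30 cephalometric variables; the toy example only shows the mechanics of the assignment and update steps.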

One of the limitations of most clustering techniques, including k-means, is that the user must specify the number of clusters (k) before the algorithm is applied. Therefore, human knowledge must be included in the clustering process, conditioning the results. To reduce this problem, several methods for automatically selecting the most plausible number of clusters have been developed. In this paper, we selected k by computing a cluster validity index proposed in Bezdek & Pal (1998). To compute this measure, clusters were formed using k = 2, ..., 10. This procedure was repeated 30 times, and then the simulation that obtained the highest cluster validity index value was selected.
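The selection step can be sketched as follows. The text does not state which of the indexes of Bezdek & Pal (1998) was computed, so this sketch substitutes the classic Dunn index (smallest between-cluster distance divided by largest cluster diameter) as a stand-in; the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Classic Dunn index: higher values mean compact, well-separated clusters.

    Assumes at least two clusters are present in `labels`.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    # Largest distance between two points of the same cluster.
    diameter = max(
        (dist(p, q) for g in groups for p, q in combinations(g, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    separation = min(
        dist(p, q)
        for g, h in combinations(groups, 2)
        for p in g
        for q in h
    )
    return separation / diameter if diameter > 0 else float("inf")


def most_plausible(points, candidate_labelings):
    """Return the candidate labeling with the highest validity index."""
    return max(candidate_labelings, key=lambda labs: dunn_index(points, labs))


# Toy data (not study data) with one sensible and one poor partition.
toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = [0, 0, 1, 1]
poor = [0, 1, 0, 1]
best = most_plausible(toy, [poor, good])
```

In a setting like the one described, the candidate labelings would come from repeated k-means runs for each k, and the index would be compared across all of them.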

The Shapiro-Wilk test was performed to evaluate the distribution of each variable, and analysis of variance (ANOVA) followed by the Tukey post-hoc test was conducted to verify differences in the measurements between clusters. The level of significance adopted was p<0.001.
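For one variable, the one-way ANOVA compares the variation between cluster means with the variation within clusters. A minimal sketch of the F statistic is given below; the function name and the toy values are hypothetical, and the p-value (which requires the F distribution, typically taken from a statistics package) and the Tukey post-hoc comparisons are not shown.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy values (not study data): one variable measured in three groups.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(toy_groups)
```

With four clusters and 100 patients, the same computation would have 3 and 96 degrees of freedom for each of the 30 variables.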

To visualize the clusters formed, principal component analysis (PCA) was used to project the 30-dimensional data points onto a 2-dimensional plane. Also, to explore which of the 30 variables are relevant for classifying a data example into one of the clusters found, the J48 decision tree classifier (Weka implementation of C4.5) (Quinlan, 1993) was used.
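To illustrate how a C4.5-style tree can rank variables, the sketch below computes the gain ratio of a binary split on one numeric variable, which is the criterion C4.5 uses to choose among candidate splits. It is a simplified illustration, not the J48 implementation: the function names, thresholds, and toy values are hypothetical, and pruning and threshold search are omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric variable at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - remainder
    # Split information penalizes very unbalanced partitions.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


# Toy values (not study data): one variable and two cluster labels.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
clusters = ["a", "a", "a", "b", "b", "b"]
clean_split = gain_ratio(values, clusters, threshold=5.0)
weak_split = gain_ratio(values, clusters, threshold=1.5)
```

A variable whose best threshold separates the clusters cleanly scores close to 1 and tends to appear near the root of the tree.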

Fig. 1. Landmarks used for the cephalometric analysis.

Table I. Cephalometric measurements (variables) recorded for the present study.

Fig. 2. The k-means algorithm.

**RESULTS**

The ICC obtained was 0.87, which is considered almost perfect according to Mandeville (2005). The best cluster validity index values for k = 2, ..., 10 are shown in Figure 3. The higher the value of the index, the more plausible k is. The most plausible value is k=2, but this result would limit our analysis to two biotypes, which are general and not very specific. For this study, therefore, the next most plausible value, k=4, was used.

The four clusters are shown in Figure 4 using a PCA projection. When k=2 is considered, clusters 2, 3, and 4 merge to form one cluster.

Cluster 1 is formed by 23 individuals (mean age 15.13±1.83 years), cluster 2 by 27 individuals (mean age 14.14±1.67 years), cluster 3 by 29 individuals (mean age 14.2±2.18 years), and cluster 4 by 21 individuals (mean age 14.47±2.18 years).

The four clusters differ significantly (P<0.001) in 24 of the 30 variables, suggesting that the groups formed represent genuinely different patterns of craniofacial form.

The Tukey post-hoc test shows the differences between pairs of clusters: cluster 1 is statistically different from cluster 2 in 19 variables, from cluster 3 in 14 variables, and from cluster 4 in 15 variables. Cluster 2 is statistically different from each of clusters 3 and 4 in 10 variables, and clusters 3 and 4 are statistically different in 13 variables.

The age of the subjects (not included in the cluster analysis) was similar between the groups formed. This makes it unlikely that differences in the size of the structures caused by variation in the patients' growth stage affected the clustering results.

Using the J48 classifier, 8 variables turned out to be relevant for describing the clusters: mandibular length (Co-Gn), anterior facial height (N-Me), articular angle (S-Ar-Go), anterior inferior height (ENA-Me), SNA, Pts-M, sella angle (N-S-Ar), and SNB. With these variables, the J48 classifier correctly classified 93 of the 100 patients into their respective clusters.

The values of the relevant variables are shown in Table II, and the classification rules obtained by the J48 classifier are shown in Figure 5.

Fig. 3. Cluster validity index value (the higher the better) for different values of k.

Fig. 4. Visualization of the clusters using principal component analysis.

Fig. 5. Result of the J48 decision tree classifier applied to the cephalometric data.

Table II. Values of relevant variables, age and vert index for each cluster.

**DISCUSSION**

The ANB angle, which is extensively used to classify patients into skeletal classes, is not significantly different between the four clusters found. Nevertheless, it is important to clarify that the randomly drawn sample contained no patients with a negative ANB, so this classification would be valid only for patients with ANB ≥ 0°.

Analyzing the variables that proved relevant for each cluster, together with other variables showing significant differences, we can describe the four clusters by the singular features of each. Cluster 1 is the most different from the rest of the clusters, as can be observed in the visualization of the patient distribution (Figure 4). Its patients have the largest mandible and the largest and most advanced maxilla. However, the group mean maintains an acceptable sagittal relationship between the maxilla and the mandible (ANB 4°).

Patients in cluster 2, in contrast, have the least posterior height, the most retruded mandible, and the smallest maxilla. However, the group mean also maintains an acceptable sagittal relationship between the maxilla and the mandible (ANB 4.9°).

Individuals in cluster 3 present a large mandible, together with either a larger maxilla or a normal maxilla with a more retruded mandible. In both cases, the patients maintain an acceptable sagittal relationship between the maxilla and the mandible (ANB 3.6°).

In cluster 4, a tendency toward sagittal harmony was also observed (ANB 4.4°): all patients have a regular-sized mandible, together with either a normally positioned and proportionally sized maxilla or a more anterior maxilla accompanied by a smaller inferior facial height.

Among the variables used by J48 for the classification, mandibular length (Co-Gn) comes first and appears to be the principal variable, dividing the individuals into two main groups. The anterior facial height (ENA-Me) also plays an important role in the classification. Some variables appear to be negatively correlated across the total sample: a shorter total mandibular length is often accompanied by a larger mandibular base, a greater anterior facial height by a smaller articular angle, and so on.

As Enlow (1996) described, multiple compensations occur during growth and human evolution. In the present study these compensations appear to follow certain rules, resulting in patterns of compensation: some individuals differ in the sagittal position of the maxilla, which compensates for a shorter or longer mandible and brings the sagittal relationship close to normal, while others differ in facial height, in mandibular form, or in the angulation of the cranial base.

As we can see, mandibular length is the first relevant variable. The reason could lie in the strong genetic influence on its growth behavior; together with the cranial structures, the mandible would form the scaffold for the morphological development of the rest of the face. Therefore, we suggest that the craniofacial structures compensate for the genetically predetermined mandibular and cranial morphology.

In summary, the different patterns found in this study appear to result from how the entire craniofacial complex compensates for the dominant genetic influence of the mandible in order to achieve facial harmony. This information could be used to develop more accurate diagnoses and individualized treatment plans.

**CONCLUSION**

The clustering analysis is effective in identifying different craniofacial patterns based on a multivariate database. The distinct clusters appear to be caused by differences in the compensation process of the facial structure responding to a genetically determined cranial and mandible form.

The proposed method can be applied to several databases, creating specific classifications of craniofacial morphology adapted to different populations. Also, unlike previously described methods, this method discovers patterns in an unsupervised manner. This is an advantage, because the classification is carried out without reference to any standard values.

Future research will consider an automatic classification stage, which will classify new patients into one of the four facial biotype patterns found in this work. Also, each cluster should have an associated treatment plan that takes into account the characteristics of its biotype pattern, with the aim of obtaining better results at the end of the treatment period.

Finally, more research is needed to validate our results and determine the possible applications of the craniofacial biotypes found as part of the treatment plan process.

**REFERENCES**

Bedoya, A.; Osorio, J. C. & Tamayo, J. A. Facial Biotype in Three Colombian Ethnic Groups: A New Classification by Facial Index. Int. J. Morphol., 30(2):677-82, 2012.

Bezdek, J. C. & Pal, N. R. Some new indexes of cluster validity. IEEE Trans. Syst. Man Cybern., 28(3):301-15, 1998.

Bishara, S. E. Facial and dental changes in adolescents and their clinical implications. Angle Orthod., 70(6):471-83, 2000.

Chvatal, B. A.; Behrents, R. G.; Ceen, R. F. & Buschang, P. H. Development and testing of multilevel models for longitudinal craniofacial growth prediction. Am. J. Orthod. Dentofacial Orthop., 128(1):45-56, 2005.

Downs, W. B. Variations In Facial Relationship: Their Significance In Treatment and Prognosis. Angle Orthod., 19(3):145-55, 1949.

Enlow, D. H. & Hans, M. G. Essentials of facial growth. Philadelphia, Saunders W.B. Co. Ltd., 1996.

Gregoret, J. Orthodontics and Surgery: Diagnosis and Planning. Barcelona, Espaxs, 2002.

Härdle, W. & Simar, L. Applied multivariate statistical analysis. Berlin-Heidelberg-New York, Springer Verlag, 2003.

Ilayperuma, I. Evaluation of Cephalic Indices: a clue for racial and sex diversity. Int. J. Morphol., 29(1):112-7, 2011.

Kim, J. Y.; Lee, S. J.; Kim, T. W.; Nahm, D. S. & Chang, Y. I. Classification of the Skeletal Variation in Normal Occlusion. Angle Orthod., 75(3):311-9, 2005.

Kocadereli, I. & Telli, A. E. Evaluation of Ricketts' long-range growth prediction in Turkish children. Am. J. Orthod. Dentofacial Orthop., 115(5):515-20, 1999.

Koch, R. & Bartsch, A. Pattern and prediction of orthodontic treatment course. Eur. J. Orthod., 18(6):645-54, 1996.

MacQueen, J. Some methods for classification and analysis of multivariate observations. In: Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, University of California Press, 1967.

McCarthy, J. Plastic Surgery: Cleft lip and palate, and craniofacial anomalies. V. 4. Philadelphia, W. B. Saunders Company, 1990.

McIntyre, G. T. & Mossey, P. A. Size and shape measurement in contemporary cephalometrics. Eur. J. Orthod., 25(3):231-42, 2003.

Mandeville, P. B. Tema 9: coeficiente de correlación intraclase (ICC). Ciencia UANL, 8(3):414-6, 2005.

Quinlan, J. R. C4.5: programs for machine learning. San Francisco, Morgan Kaufmann, 1993.

Ricketts, R. Planning treatment on the basis of the facial pattern and an estimate of its growth. Angle Orthod., 27(1):14-37, 1957.

Riesmeijer, A. M.; Prahl-Andersen, B.; Mascarenhas, A. K.; Joo, B. H. & Vig, K. W. A comparison of craniofacial Class I and Class II growth patterns. Am. J. Orthod. Dentofacial Orthop., 125(4):463-71, 2004.

Sassouni, V. A classification of skeletal facial types. Am. J. Orthod., 55(2):109-23, 1969.

Schudy, F. F. Vertical growth versus anteroposterior growth as related to function and treatment. Angle Orthod., 34(2):75-93, 1964.

Vojdani, Z.; Bahmanpour, S.; Momeni, S.; Vasaghi, A.; Yazdizadeh, A.; Karamifar, A.; Najafijar, A.; Setoodehmaram, S. & Mokhtar, A. Cephalometry in 14-18 Year Old Girls and Boys of Shiraz-Iran High School. Int. J. Morphol., 27(1):101-4, 2009.

Witten, I. H. & Frank, E. Data Mining: Practical machine learning tools and techniques. 2nd ed. San Francisco, Morgan Kaufmann, 2005.

Zamora, C. & Duarte, S. Atlas de cefalometría: análisis clínico y práctico. Caracas, Amolca, 2003.

Correspondence to:

**Pamela A. Araya-Díaz**

Área de Ortodoncia

Facultad de Odontología

Universidad Andrés Bello

Calle Echaurren 237

Santiago

CHILE

Email: pam.araya@uandresbello.edu

Received: 25-05-2013

Accepted: 27-07-2013