SIGN LANGUAGE FORMAL DESCRIPTION AND SYNTHESIS

The special needs of deaf people now call for sign language synthesis. The system presented here is based on a hierarchical description of signs that tries to take the different grammatical processes into account. Emphasis is placed on the specification of hand configurations by means of finger shape primitives and global hand properties, and on the computation of location and orientation. We then present the results obtained from the corresponding written form of signs, leading to their virtual computer animation.


Introduction
Sign language appears to be the main means of communication within the deaf community. Despite the repression it suffered in several countries until the recent past, it remains a living language, in the full sense of the term, with communication capacities similar to those of oral languages.
Our project of synthesizing sign language is based on the established fact that half of deaf people encounter difficulties in reading and, as a result, suffer from undereducation. Being able to synthesize their mother tongue would undoubtedly be of major interest to them. In education as well as in daily life (in an emergency, for instance), generating synthetic signs offers many advantages over video: ease and speed of generation and transmission, availability of the virtual signer, and smoothness of transitions between lexical items.
The first three-dimensional synthesis of signs was achieved in the early eighties (Shantz and Poizner, 1982). It suffered, however, from the limited computing power of the time, which entailed a low-level specification of body postures (joint angles). Surprisingly, very few attempts have been made since then to pursue the same goal.
This study is in keeping with the global aim of translating written French into signs. Here, however, the starting point is a textual representation of the signed sentence to be animated; consequently, we will not deal with sign language syntax. The core of the system is a formal description of signs, grounded on a set of primitives isolated beforehand and stemming from both linguistic work and more synthesis-oriented research.

GRAMMATICAL BASES AND MODULATION PROCESSES
Following in Stokoe's footsteps, linguists have shown that sign languages, like oral ones, are doubly articulated into phonemic and morphemic (or monemic) levels. This decomposition has firmly established that the sign systems of deaf people rank among the languages of the world. Second-level articulation units, or cheremes, have been identified on the minimal pairs criterion (Nève, 1997), i.e. two signs that differ from one another in only one of these units. Belonging to one of four spatial types (hand configuration, manual location, orientation, and movement), they combine to form morphemes, lexemes (signs) and whole statements; their psycholinguistic reality has moreover been fully validated.
Sign language has its own grammar, based on spatial, physiological and temporal dimensions. For instance, strongly iconic classifiers and size-and-shape specifiers, but also pronouns and index references, rely on mechanisms involving the signer's space. Non-manual expressions play both paralinguistic (kinetic stress, facial expression) and grammatical roles (marking syntactic clauses). Finally, it has been shown that special modulation processes (Klima and Bellugi, 1979; Namir and Schlesinger, 1978) affecting the repetition, shape, amplitude and kinematics of movement (Loomis et al., 1983) are used to express subtle variations in meaning. Liddell and Johnson (1989) have proposed a highly detailed description of signs based on the partition between a segmental tier (holds and movements) and an articulatory bundle tier (containing features of the hand). One of the major interests of this approach is that it tackles particular phonological processes (hold deletion, assimilation) but also morphological processes of sign language, including the fundamentally important one of subject and object agreement.

A SIGN LANGUAGE FORMAL DESCRIPTION
The proposed formal description system tries to take the widest range of such processes into account. To that end, it has been split into two levels. On the one hand, the sentence has its own parameters (localization and indexic references, grammatical clause descriptor, …). On the other hand, the description of signs may inherit some of those discourse parameters. We will focus here on sign specification. In order to provide the most general synthesis possible, able to describe not only French Sign Language but every sign language, one of the basic underlying principles was to identify primitives at all levels, intended to be gradually combined into more and more complex structures (see figure 1).
In that way, a hand configuration is described in terms of digital primitives and global hand properties (see details in section 3). Together with orientation, location and an optional manual point, it makes up what we call a hand specification. Specifying a contact point is convenient in the many cases where that point, rather than the wrist, must be located at the given position. Movement, for its part, is composed of one main move, with a specific path and dynamics. Optional preceding and following holds, as well as a superimposed secondary movement (waving of the fingers, for instance), may complete the movement specification.
A shift primitive is built as a bundle of the hand specifications at the beginning and end of the sign, and of the movement itself. A transitional hand specification may also be added if necessary. Depending on the number and activity of the articulators involved in a sign, a shift can be described:
- either as a single shift primitive (if only the strong hand is present),
- or as a (dominant hand) shift primitive plus a hand specification for the weak hand,
- or as two shift primitives if both hands are active. In this case, the description is completed with a spatial relationship and a descriptor for synchronization between the two movements (see table 1).
We introduce the notion of macro-shift in order to be able to describe quite complex signs (such as compounds). This is made up of one or more shifts, optionally followed  (Moody, 1986). Finally, we have endowed our sign description with grammatical and non-manual information. Facial expression is particularly salient when considering grammatical parameters inherited from the sentence level.
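The hierarchy described above can be pictured as nested records. The following Python sketch is only an illustration under stated assumptions: every class name, field name and symbolic value (such as "neutral_space" or "free") is hypothetical and is not taken from the system itself, whose actual description language is Smalltalk-like. The macro-shift is modelled as a plain list of shifts.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class HandConfiguration:
    finger_shapes: List[str]                  # one finger shape primitive per finger
    thumb_shape: str                          # free thumb primitive (hypothetical name)
    global_properties: List[str] = field(default_factory=list)


@dataclass
class HandSpecification:
    configuration: HandConfiguration
    location: str
    orientation: str
    manual_point: Optional[str] = None        # optional contact point on the hand


@dataclass
class Movement:
    path: str                                 # main move, e.g. a straight line
    dynamics: str = "neutral"
    hold_before: bool = False
    hold_after: bool = False
    secondary: Optional[str] = None           # e.g. waving of the fingers


@dataclass
class ShiftPrimitive:
    start: HandSpecification
    end: HandSpecification
    movement: Movement
    transition: Optional[HandSpecification] = None


@dataclass
class Shift:
    strong: ShiftPrimitive
    weak: Union[ShiftPrimitive, HandSpecification, None] = None
    spatial_relationship: Optional[str] = None
    synchronization: Optional[str] = None

    def kind(self) -> str:
        """Name which of the three cases listed in the text applies."""
        if self.weak is None:
            return "strong hand only"
        if isinstance(self.weak, HandSpecification):
            return "strong hand moving, weak hand static"
        return "both hands active"


@dataclass
class MacroShift:
    shifts: List[Shift]                       # one or more shifts


# Hypothetical example values, for illustration only.
flat = HandConfiguration(["#Flat"] * 4, thumb_shape="free")
start = HandSpecification(flat, "upper_torso", "palm_up", manual_point="index_tip")
end = HandSpecification(flat, "neutral_space", "palm_up")
primitive = ShiftPrimitive(start, end, Movement(path="straight_line"))

one_handed = Shift(strong=primitive)
with_static_weak_hand = Shift(strong=primitive, weak=end)
two_handed = Shift(strong=primitive, weak=primitive,
                   spatial_relationship="parallel", synchronization="simultaneous")
sign = MacroShift([one_handed])
```

Keeping the weak-hand slot as either a shift primitive or a bare hand specification mirrors the three cases of the list above without needing three separate record types.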

Hand Configurations
Hand configurations in sign languages have given rise to several coding systems (Kurokawa, 1992; Lee, 1994). Among them, HamNoSys (Prillwitz and Zienert, 1990) seems able to encode the widest diversity of hand shapes. We have set up a new descriptive method, based on the examination of different inventories and split into three levels: finger shape primitives, global hand properties and constraints on joints.

FINGER CONFIGURATION PRIMITIVES
All fingers except the thumb behave in the same way in terms of movement: they may flex at their interphalangeal joints, and both flexion and abduction (spreading of the fingers) are permitted at the metacarpophalangeal (MCP) joint. The thumb has a larger reachable workspace owing to its greater articulatory complexity. In particular, it may contact other fingers in various ways. In those cases, its shape will be considered here as constrained by a global hand property (the primitives specify only free thumb configurations). We have isolated seven finger shape primitives, and five for the thumb, as being used in sign languages. For instance, the configuration called '#Flat' (see figure 2) has a flexed MCP joint and extended proximal and distal interphalangeal joints. The very peculiar '#E' configuration has been taken into consideration at this level, although finger interaction is involved. As a matter of fact, it appears almost exclusively in the corresponding manual alphabet entry.

HAND PROPERTIES AND CONSTRAINTS ON JOINTS
Hand properties involve more than one digit. Thumb and finger contact (or opposition), and abduction or crossing of fingers, are taken into account here. Many kinds of such relationships were found in sign languages:
• Contact, or opposition without contact, between the thumb and another finger:
- Ventral contact: the thumb tip touches the middle phalanx of the finger in question, which may have any configuration (#Hook in most cases).
- Flat contact between the pads of the thumb and of another finger, which should have the #Flat configuration.
- Contact between the tips of the thumb and of another finger, which should have the #O configuration.
• Crossing of the index and middle fingers. Other kinds of crossing are far more constraining for the joints and are absent from sign language.
• Intercalation of the thumb, mainly between the index and middle fingers. The thumb may also functionally come between the middle and ring fingers, or between the ring and little fingers; nevertheless, those situations are seldom used in sign language.
• Dorsal contact: the thumb pad covers a finger on the back of its middle phalanx.
• Abduction / adduction is also considered here, insofar as it generally involves several fingers, if not the whole hand.
Figure 2 above shows an example of a flat contact between the thumb and index finger.
The final step in hand configuration specification is the application of constraints on joints. Indeed, the structure of the metacarpophalangeal ligaments implies a more or less strong interdependence between the flexions of neighbouring fingers at that joint. The inequalities proposed in (Lee and Kunii, 1993) have been applied to the MCP joints as flexion limits, in order to synthesize realistic hand shapes. Moreover, in a clenched hand, the fingers converge towards the scaphoid point. That phenomenon has been included in the synthesis process by setting an artificial abduction, proportional to flexion, when the latter exceeds two thirds of its static maximal value.
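The convergence rule for a clenched hand can be illustrated with a short sketch. It follows a literal reading of the sentence above: no artificial abduction up to two thirds of the maximal flexion, then an abduction proportional to the flexion itself. The function name, the `gain` parameter and its default value are assumptions made for the example; the proportionality constant used in the actual system is not given in the text, and the sign of the abduction (towards the scaphoid point) is left to the caller because it depends on the finger.

```python
def artificial_abduction(flexion: float, max_flexion: float, gain: float = 0.1) -> float:
    """Magnitude of the artificial abduction added at an MCP joint.

    flexion and max_flexion use the same angular unit (degrees here).
    gain is a hypothetical proportionality constant, not a value from the paper.
    """
    if max_flexion <= 0:
        raise ValueError("max_flexion must be positive")
    threshold = 2.0 * max_flexion / 3.0
    if flexion <= threshold:
        return 0.0            # hand not clenched enough: fingers stay parallel
    return gain * flexion     # clenched hand: abduction proportional to flexion
```

Read literally, the rule is discontinuous at the threshold; a variant proportional to the excess over the threshold would be continuous, but the text does not say which form was used.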

HAND CONFIGURATION SYNTHESIS
The kinematic skeleton of the thumb, as for the whole body, is mathematically modeled by ideal joints and flat segments, each one being defined in the local coordinate system attached to the proximal joint. Axes are chosen such that x is the main axis of each segment, oriented from the proximal to the distal joint, and that (xz) is the plane in which segment points are given (palm for instance in figure 3), y being defined by the right-hand rule. Rotations about those axes will be considered in the following order: adduction-abduction (yaw) θ about the y-axis, then flexion-extension (pitch) ϕ about the x-axis, and finally axial rotation (roll) ψ about the z-axis.
THE INTERNATIONAL JOURNAL OF VIRTUAL REALITY Vol. 3, No. 4
Figure 3: Kinematic model of the thumb.
The major issue in hand configuration synthesis is the computation of the thumb posture. As recalled above, this articulator has more degrees of freedom than the other fingers. Several methods have been described to position such articulated structures, even when multiple goals are to be achieved (Badler et al., 1987). We have instead tried to simplify our initial model so as to obtain a non-redundant system and solve it. Axial rotation takes place at the carpometacarpal (CMC) joint, especially during opposition movement. It is handled here by a fixed initial rotation about the axis of the first metacarpal (see figure 3). Neglecting the small adduction-abduction angle, only flexion is considered at the MCP joint, and it is moreover assumed to stay within a narrow range of values around ten degrees. The thumb tip may therefore be expressed in the CMC local coordinate system as where c means cosine, s means sine, and p_1, p_2, p_3 are the segment lengths, from the proximal to the distal one.
These inverse kinematics equations are solved by means of a method presented in (Kobrinski and Kobrinski, 1989), which assumes that the distance from x_T to the target position x is expressed as a function of only one of the unknowns, the others remaining fixed. Let α_i be the i-th rotation angle and let us determine its optimal value α_i*. Rewriting x_T as where vectors a_α, b_α, and d_α do not depend upon α_i, the
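The one-unknown-at-a-time idea can be illustrated on a simplified case. The sketch below is not the authors' thumb model: it assumes a planar chain of three segments with purely hypothetical lengths, and it assumes that the tip can be written as x_T = a cos α_i + b sin α_i + d, which holds for any rotation about a fixed axis. Under that assumption, with a and b perpendicular and of equal length, the distance to the target x is minimal for α_i* = atan2(b·(x − d), a·(x − d)). Sweeping over the angles in turn never increases the distance.

```python
import math


def joint_positions(lengths, angles):
    """Positions of every joint and of the tip for a planar chain (relative angles)."""
    x = y = total = 0.0
    points = [(x, y)]
    for length, angle in zip(lengths, angles):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
        points.append((x, y))
    return points


def tip_error(lengths, angles, target):
    tx, ty = joint_positions(lengths, angles)[-1]
    return math.hypot(tx - target[0], ty - target[1])


def solve_one_angle_at_a_time(lengths, target, angles, sweeps=200, tol=1e-9):
    """Optimise each angle in closed form while the others remain fixed."""
    angles = list(angles)
    for _ in range(sweeps):
        for i in range(len(angles)):
            points = joint_positions(lengths, angles)
            dx, dy = points[i]                      # d: position of joint i
            vx, vy = points[-1][0] - dx, points[-1][1] - dy
            c, s = math.cos(angles[i]), math.sin(angles[i])
            ax, ay = c * vx + s * vy, -s * vx + c * vy   # a: joint-to-tip vector for alpha_i = 0
            bx, by = -ay, ax                             # b: a rotated by 90 degrees
            gx, gy = target[0] - dx, target[1] - dy      # x - d
            angles[i] = math.atan2(bx * gx + by * gy, ax * gx + ay * gy)
        if tip_error(lengths, angles, target) < tol:
            break
    return angles


# Hypothetical segment lengths (proximal to distal) and target, arbitrary units.
lengths = [4.5, 3.5, 3.0]
target = (6.0, 4.0)
start = [0.3, 0.3, 0.3]
solution = solve_one_angle_at_a_time(lengths, target, start)
```

A fully stretched chain aimed exactly at the target is a stationary point of this scheme, so a slightly bent starting posture is used here.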

LOCATION
The range of locations in sign language is limited by the reachable workspace of the hand (Lenarcic and Umek, 1994): almost always above the waist, ahead of the frontal plane, roughly within half spheres placed in front of the torso, and in front of and on both sides of the head. In our system, the hand may be assigned a location in two ways:
• in space, with main points on a grid defined as intersections of three planes (horizontal, parallel to the frontal plane, parallel to the sagittal plane);
• on the signer's body (including the weak hand), at various locations inspired by linguistic studies (Liddell and Johnson, 1989).
Points on the body can be expressed in terms of spatial coordinates in a 3D synthesis. Therefore, what we have to do here is, given a target location for the wrist, compute the four angles of the arm (three at the shoulder, θ_s, ϕ_s and ψ_s, and one at the elbow, ϕ_e). The above stratagem keeps us from generating unnatural configurations of the arm, while avoiding computationally expensive inverse kinematics algorithms.
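Of the four arm angles, the elbow flexion ϕ_e is the only one fixed by the shoulder-to-wrist distance alone, through the law of cosines. The sketch below illustrates that single step; it does not reproduce the method used in the system for the three shoulder angles, which is not detailed here. The function name, the argument names and the segment lengths in the usage note are assumptions.

```python
import math


def elbow_flexion(shoulder, wrist_target, upper_arm, forearm):
    """Elbow flexion in radians (0 = arm fully extended) placing the wrist
    at the requested distance from the shoulder.

    Targets outside the reachable range are clamped to the nearest
    reachable distance.
    """
    distance = math.dist(shoulder, wrist_target)
    distance = min(max(distance, abs(upper_arm - forearm)), upper_arm + forearm)
    # Law of cosines: interior angle between upper arm and forearm.
    cos_interior = (upper_arm ** 2 + forearm ** 2 - distance ** 2) / (2.0 * upper_arm * forearm)
    cos_interior = max(-1.0, min(1.0, cos_interior))
    return math.pi - math.acos(cos_interior)
```

For instance, with a hypothetical upper arm of 30 cm and forearm of 25 cm, a wrist target 55 cm away from the shoulder gives zero flexion, and any farther target is treated as a fully extended arm.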

ORIENTATION
According to linguistic studies, palm orientation seems to be sufficient to encode hand orientation. Only six main directions (up, down, front, back, ipsilateral and contralateral) are generally isolated as a result of phonological substitution. From our synthesis perspective, however, we cannot dispense with specifying hand orientation more precisely. This is done by defining both a forward vector (n) normal to the palm, and a vector (i) oriented along the (virtually) pointing index finger, or along the axis of the third metacarpal, from CMC3 to MCP3. The problem of global hand orientation consists in computing:
• the radio-ulnar pronation-supination angle ψ_e,
• two angles at the wrist joint: adduction-abduction θ_w and flexion-extension ϕ_w.
In mathematical terms, let A = (s n a) be the current orientation of the local coordinate system at the elbow (R′_2, expressed in R_0), after shoulder rotations and elbow flexion have been applied. The required global orientation of the hand is thus expressed as: .
It often appears easier and more convenient to specify a particular point M on the hand that is to be placed at the given location, rather than setting the location of the wrist W. In the sign I-give-you, for example (see section 5), the index fingertip should be located on the upper torso at the beginning of the movement. Since the coordinates of M are known in the local coordinate system R″_3 attached to the palm, the wrist location can be found easily as
Finally, specifying relative orientation has been provided for by using a set of predefined values for each of the angles ψ_e, θ_w and ϕ_w. Moreover, in a number of cases, only pronation-supination of the forearm is required to describe the orientation of the hand. For the moment, this method is used only when the wrist joint, rather than a hand point, is concerned by the target location.
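The relation between a manual point M and the wrist W can be sketched as follows. If R is the global orientation of the hand and m holds the coordinates of M in the palm's local system, then M sits at W + R m, so placing M on a target T gives W = T − R m. This is a geometric reading and not necessarily the authors' own formula. The sketch builds R from the two orientation vectors under the assumption that the local x-axis follows i and the local y-axis follows n, which is consistent with the axis convention stated earlier but whose sign is a guess; all function names are hypothetical.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length direction vector")
    return tuple(c / norm for c in v)


def wrist_location(target, point_local, index_dir, palm_normal):
    """Wrist position that brings the manual point onto the target.

    target:      global position T requested for the manual point M
    point_local: coordinates m of M in the palm's local system
    index_dir:   global direction of vector i (assumed local x-axis)
    palm_normal: global direction of vector n (assumed local y-axis)
    """
    x_axis = normalize(index_dir)
    y_axis = normalize(palm_normal)
    if abs(sum(a * b for a, b in zip(x_axis, y_axis))) > 1e-9:
        raise ValueError("index direction and palm normal must be perpendicular")
    z_axis = cross(x_axis, y_axis)
    # R m, with the three axes as the columns of R.
    offset = tuple(point_local[0] * x_axis[k]
                   + point_local[1] * y_axis[k]
                   + point_local[2] * z_axis[k] for k in range(3))
    return tuple(target[k] - offset[k] for k in range(3))
```

With the hand axes aligned on the global ones, a manual point 8 units ahead of the wrist simply shifts the wrist 8 units back from the target along x.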

SYNOPSIS
To specify any sign according to the structure presented in section 2, we have preferred an extensive textual description of the sign to a symbol-based coding system (for reasons of legibility and ease of data exchange). The sign compiler implements a lexical analyser, a parser, and the evaluator itself. It builds the hierarchic sign specification structure in memory from predefined entities such as hand configurations, while inheriting grammatical information from the sentence level through sign parameters (see figure 6).
The graphical synthesis module generates the signs once each one has been evaluated in this way. The underlying model of the human body is a hierarchic tree of upper body segments and joints, from the torso to the eyebrows and distal phalanges. Each segment is described by a collection of 2D points and has one or more child segments, together with the attached proximal joint (including the local and global coordinate systems). Moreover, segments are grouped into macro-segments (fingers, hands, head, ...) that accept high-level messages.
Input sentences take the form of streams of sign-words and escape codes handling grammatical features such as indexic references and role play, time setting, clause type (condition, wh-question, yes/no question, imperative), grammatical repetition, etc. Signs are described in a formal language whose syntax is close to that of the Smalltalk object-oriented programming language.

RESULTS
At the current stage of our work, the system computes arm postures as well as hand configuration and orientation from the given textual description of a sign. The symbols used to specify hand shapes, contact points, spatial locations and orientations are evaluated so as to transmit suitable objects to the human body structure. The graphical synthesis has been achieved by means of connected deformable polygons, with distance-dependent lighting and hidden surface removal. Primitives of facial expression have been added too, as part of lexical items, but also because they convey crucial grammatical information.
Two parameters have been inserted in the sign give above to take subject and object agreement into account: thanks to local variables, we are able to determine the sign features for the different possible values (I/me, you and he/him) of the agent and patient parameters. The syntax used leads to a very readable description of the sign, in which the formal hierarchic sign specification clearly appears. It is built and referenced by sending messages to objects. A graphical interface is provided with the editor in order to make sign editing easier. So far, it allows the user to add hand specifications with direct three-dimensional visualization of the selected elements.
For the moment, movement remains rather simple: 'natural' shifts (through joint angle interpolation from one hand specification to another) and the main types of path (straight line, circle, arc) have been characterized, and are easily specified by the user. More complex movements may then be considered as series of goals (Lebourque and Gibet, 1997), and synchronization between the two articulators may be handled by event-driven processes with Grafcet or Petri-net representations. Besides, we have recently developed a specific application intended for the analysis of sign duration and dynamics from video sequences. This should enable us to specify the tense of signs better, as well as the pauses between signs at the sentence level.
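The simple movement types mentioned above can be sketched in a few lines. The example assumes that a 'natural' shift blends the joint angles linearly between two hand specifications, and that straight lines and arcs are sampled with a parameter t running from 0 to 1; the joint names, the function names and the choice of the xy plane for arcs are hypothetical and serve only as an illustration.

```python
import math


def interpolate_angles(start, end, t):
    """Blend two postures given as {joint name: angle} dictionaries (0 <= t <= 1)."""
    if start.keys() != end.keys():
        raise ValueError("both postures must describe the same joints")
    return {name: (1.0 - t) * start[name] + t * end[name] for name in start}


def line_path(p0, p1, t):
    """Point at parameter t on the straight line from p0 to p1."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(p0, p1))


def arc_path(center, radius, angle_start, angle_end, t):
    """Point at parameter t on a circular arc drawn in the xy plane around center.

    A full circle is the special case angle_end = angle_start + 2 * pi.
    """
    angle = (1.0 - t) * angle_start + t * angle_end
    return (center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle),
            center[2])


# Hypothetical postures, in degrees.
posture_a = {"elbow_flexion": 0.0, "wrist_flexion": 10.0}
posture_b = {"elbow_flexion": 90.0, "wrist_flexion": 30.0}
halfway = interpolate_angles(posture_a, posture_b, 0.5)
```

Sampling t at the frame rate of the animation then yields one posture or one wrist target per frame; non-uniform dynamics could be obtained by warping t before sampling.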

Conclusion
One of the guidelines of the system described here was to be able to generate any kind of sign, as far as possible. The proposed formal description is generic enough to take up such a challenge. Not only does it provide the primitives and more complex structures found in every sign, but it also tries to take the widest variety of sign language grammatical processes into consideration. Such an open system leaves the user entirely free to adjust the precision with which signs are encoded. Movement must still be specified precisely in terms of path dimension and dynamics, together with the arrangement and synchronization of the hands. But the virtual signer already shows interesting results for simple signs, especially a high-level specification of hand configuration, location and orientation. In collaboration with native deaf signers, we now intend to test the generated synthetic signs in order to optimize their expressive potential.