# HG changeset patch
# User Eli Zaretskii
# Date 1262357847 18000
# Node ID b1e1b45c9fb662b6d38c1d5167e8cc37842f5215
# Parent  9e8415b885eeece645e9d5715d812ccd99229e14
Retrospective commit from 2009-12-19.
Fix reordering of Arabic text in etc/HELLO.
Extend .gdbinit commands to support bidirectional display.

 buffer.c (Fbuffer_swap_text): Swap the values of
 bidi_display_reordering and bidi_paragraph_direction.
 bidi.c (bidi_resolve_weak): Fix nesting of conditions for Wn
 processing.  Move W3 after W1 and W2.  Simplify W4 because it is
 now always after W1.
 .gdbinit (pbiditype): New command.
 (pgx): Use it to display bidi level and type of the glyph.

diff -r 9e8415b885ee -r b1e1b45c9fb6 src/.gdbinit
--- a/src/.gdbinit	Fri Jan 01 09:46:25 2010 -0500
+++ b/src/.gdbinit	Fri Jan 01 09:57:27 2010 -0500
@@ -447,6 +447,33 @@
 Pretty print window structure w.
 end
 
+define pbiditype
+  if ($arg0 == 1)
+    printf "L"
+  end
+  if ($arg0 == 2)
+    printf "R"
+  end
+  if ($arg0 == 3)
+    printf "EN"
+  end
+  if ($arg0 == 4)
+    printf "AN"
+  end
+  if ($arg0 == 5)
+    printf "BN"
+  end
+  if ($arg0 == 6)
+    printf "B"
+  end
+  if ($arg0 < 1 || $arg0 > 6)
+    printf "%d??", $arg0
+  end
+end
+document pbiditype
+Print textual description of bidi type given as first argument.
+end
+
 define pgx
   set $g = $arg0
   # CHAR_GLYPH
@@ -475,6 +502,11 @@
   else
     printf " pos=%d", $g->charpos
   end
+  # For characters, print their resolved level and bidi type
+  if ($g->type == 0)
+    printf " blev=%d,btyp=", $g->resolved_level
+    pbiditype $g->bidi_type
+  end
   printf " w=%d a+d=%d+%d", $g->pixel_width, $g->ascent, $g->descent
   # If not DEFAULT_FACE_ID
   if ($g->face_id != 0)
diff -r 9e8415b885ee -r b1e1b45c9fb6 src/ChangeLog.bidi
--- a/src/ChangeLog.bidi	Fri Jan 01 09:46:25 2010 -0500
+++ b/src/ChangeLog.bidi	Fri Jan 01 09:57:27 2010 -0500
@@ -1,3 +1,15 @@
+2009-12-19  Eli Zaretskii
+
+	* buffer.c (Fbuffer_swap_text): Swap the values of
+	bidi_display_reordering and bidi_paragraph_direction.
+
+	* bidi.c (bidi_resolve_weak): Fix nesting of conditions for Wn
+	processing.  Move W3 after W1 and W2.  Simplify W4 because it is
+	now always after W1.
+
+	* .gdbinit (pbiditype): New command.
+	(pgx): Use it to display bidi level and type of the glyph.
+
 2009-12-12  Eli Zaretskii
 
 	* dispextern.h (struct it): New members prev_stop and
diff -r 9e8415b885ee -r b1e1b45c9fb6 src/bidi.c
--- a/src/bidi.c	Fri Jan 01 09:46:25 2010 -0500
+++ b/src/bidi.c	Fri Jan 01 09:57:27 2010 -0500
@@ -1342,123 +1342,121 @@
     type = STRONG_R;
   else if (override == L2R)
     type = STRONG_L;
-  else if (type == STRONG_AL)
-    type = STRONG_R;            /* W3 */
-  else if (type == WEAK_NSM)    /* W1 */
+  else
     {
-      /* Note that we don't need to consider the case where the prev
-         character has its type overridden by an RLO or LRO: such
-         characters are outside the current level run, and thus not
-         relevant to this NSM.  Thus, NSM gets the orig_type of the
-         previous character.  */
-      if (bidi_it->prev.type != UNKNOWN_BT)
-        type = bidi_it->prev.orig_type;
-      else if (bidi_it->sor == R2L)
-        type = STRONG_R;
-      else if (bidi_it->sor == L2R)
-        type = STRONG_L;
-      else /* shouldn't happen! */
-        abort ();
-      if (type == WEAK_EN       /* W2 after W1 */
+      if (type == WEAK_NSM)     /* W1 */
+        {
+          /* Note that we don't need to consider the case where the
+             prev character has its type overridden by an RLO or LRO:
+             such characters are outside the current level run, and
+             thus not relevant to this NSM.  Thus, NSM gets the
+             orig_type of the previous character.  */
+          if (bidi_it->prev.type != UNKNOWN_BT)
+            type = bidi_it->prev.orig_type;
+          else if (bidi_it->sor == R2L)
+            type = STRONG_R;
+          else if (bidi_it->sor == L2R)
+            type = STRONG_L;
+          else /* shouldn't happen! */
+            abort ();
+        }
+      if (type == WEAK_EN       /* W2 */
           && bidi_it->last_strong.type_after_w1 == STRONG_AL)
         type = WEAK_AN;
-    }
-  else if (type == WEAK_EN      /* W2 */
-           && bidi_it->last_strong.type_after_w1 == STRONG_AL)
-    type = WEAK_AN;
-  else if ((type == WEAK_ES
-            && (bidi_it->prev.type_after_w1 == WEAK_EN  /* W4 */
-                && (bidi_it->prev.orig_type == WEAK_EN
-                    || bidi_it->prev.orig_type == WEAK_NSM))) /* aft W1 */
-           || (type == WEAK_CS
-               && ((bidi_it->prev.type_after_w1 == WEAK_EN
-                    && (bidi_it->prev.orig_type == WEAK_EN      /* W4 */
-                        || bidi_it->prev.orig_type == WEAK_NSM)) /* a/W1 */
-                   || bidi_it->prev.type_after_w1 == WEAK_AN))) /* W4 */
-    {
-      next_char =
-        bidi_it->bytepos + bidi_it->ch_len >= ZV_BYTE
-        ? BIDI_EOB : FETCH_CHAR (bidi_it->bytepos + bidi_it->ch_len);
-      type_of_next = bidi_get_type (next_char, override);
-
-      if (type_of_next == WEAK_BN
-          || bidi_explicit_dir_char (next_char))
+      else if (type == STRONG_AL) /* W3 */
+        type = STRONG_R;
+      else if ((type == WEAK_ES /* W4 */
+                && bidi_it->prev.type_after_w1 == WEAK_EN
+                && bidi_it->prev.orig_type == WEAK_EN)
+               || (type == WEAK_CS
+                   && ((bidi_it->prev.type_after_w1 == WEAK_EN
+                        && bidi_it->prev.orig_type == WEAK_EN)
+                       || bidi_it->prev.type_after_w1 == WEAK_AN)))
         {
-          bidi_copy_it (&saved_it, bidi_it);
-          while (bidi_resolve_explicit (bidi_it) == new_level
-                 && bidi_it->type == WEAK_BN)
-            ;
-          type_of_next = bidi_it->type;
-          bidi_copy_it (bidi_it, &saved_it);
-        }
-
-      /* If the next character is EN, but the last strong-type
-         character is AL, that next EN will be changed to AN when we
-         process it in W2 above.  So in that case, this ES should not
-         be changed into EN.  */
-      if (type == WEAK_ES
-          && type_of_next == WEAK_EN
-          && bidi_it->last_strong.type_after_w1 != STRONG_AL)
-        type = WEAK_EN;
-      else if (type == WEAK_CS)
-        {
-          if (bidi_it->prev.type_after_w1 == WEAK_AN
-              && (type_of_next == WEAK_AN
-                  /* If the next character is EN, but the last
-                     strong-type character is AL, EN will be later
-                     changed to AN when we process it in W2 above.  So
-                     in that case, this ES should not be changed into
-                     EN.  */
-                  || (type_of_next == WEAK_EN
-                      && bidi_it->last_strong.type_after_w1 == STRONG_AL)))
-            type = WEAK_AN;
-          else if (bidi_it->prev.type_after_w1 == WEAK_EN
-                   && type_of_next == WEAK_EN
-                   && bidi_it->last_strong.type_after_w1 != STRONG_AL)
-            type = WEAK_EN;
-        }
-    }
-  else if (type == WEAK_ET      /* W5: ET with EN before or after it */
-           || type == WEAK_BN)  /* W5/Retaining */
-    {
-      if (bidi_it->prev.type_after_w1 == WEAK_EN /* ET/BN with EN before it */
-          || bidi_it->next_en_pos > bidi_it->charpos)
-        type = WEAK_EN;
-      /* W5: ET with EN after it.  */
-      else
-        {
-          EMACS_INT en_pos = bidi_it->charpos + 1;
-
           next_char =
             bidi_it->bytepos + bidi_it->ch_len >= ZV_BYTE
             ? BIDI_EOB : FETCH_CHAR (bidi_it->bytepos + bidi_it->ch_len);
           type_of_next = bidi_get_type (next_char, override);
 
-          if (type_of_next == WEAK_ET
-              || type_of_next == WEAK_BN
+          if (type_of_next == WEAK_BN
               || bidi_explicit_dir_char (next_char))
             {
               bidi_copy_it (&saved_it, bidi_it);
               while (bidi_resolve_explicit (bidi_it) == new_level
-                     && (bidi_it->type == WEAK_BN || bidi_it->type == WEAK_ET))
+                     && bidi_it->type == WEAK_BN)
                 ;
               type_of_next = bidi_it->type;
-              en_pos = bidi_it->charpos;
               bidi_copy_it (bidi_it, &saved_it);
             }
-          if (type_of_next == WEAK_EN)
+
+          /* If the next character is EN, but the last strong-type
+             character is AL, that next EN will be changed to AN when
+             we process it in W2 above.  So in that case, this ES
+             should not be changed into EN.  */
+          if (type == WEAK_ES
+              && type_of_next == WEAK_EN
+              && bidi_it->last_strong.type_after_w1 != STRONG_AL)
+            type = WEAK_EN;
+          else if (type == WEAK_CS)
             {
-              /* If the last strong character is AL, the EN we've
-                 found will become AN when we get to it (W2).  */
-              if (bidi_it->last_strong.type_after_w1 != STRONG_AL)
+              if (bidi_it->prev.type_after_w1 == WEAK_AN
+                  && (type_of_next == WEAK_AN
+                      /* If the next character is EN, but the last
+                         strong-type character is AL, EN will be later
+                         changed to AN when we process it in W2 above.
+                         So in that case, this ES should not be
+                         changed into EN.  */
+                      || (type_of_next == WEAK_EN
+                          && bidi_it->last_strong.type_after_w1 == STRONG_AL)))
+                type = WEAK_AN;
+              else if (bidi_it->prev.type_after_w1 == WEAK_EN
+                       && type_of_next == WEAK_EN
+                       && bidi_it->last_strong.type_after_w1 != STRONG_AL)
+                type = WEAK_EN;
+            }
+        }
+      else if (type == WEAK_ET  /* W5: ET with EN before or after it */
+               || type == WEAK_BN)      /* W5/Retaining */
+        {
+          if (bidi_it->prev.type_after_w1 == WEAK_EN /* ET/BN w/EN before it */
+              || bidi_it->next_en_pos > bidi_it->charpos)
+            type = WEAK_EN;
+          else                  /* W5: ET/BN with EN after it.  */
+            {
+              EMACS_INT en_pos = bidi_it->charpos + 1;
+
+              next_char =
+                bidi_it->bytepos + bidi_it->ch_len >= ZV_BYTE
+                ? BIDI_EOB : FETCH_CHAR (bidi_it->bytepos + bidi_it->ch_len);
+              type_of_next = bidi_get_type (next_char, override);
+
+              if (type_of_next == WEAK_ET
+                  || type_of_next == WEAK_BN
+                  || bidi_explicit_dir_char (next_char))
                 {
-                  type = WEAK_EN;
-                  /* Remember this EN position, to speed up processing
-                     of the next ETs.  */
-                  bidi_it->next_en_pos = en_pos;
+                  bidi_copy_it (&saved_it, bidi_it);
+                  while (bidi_resolve_explicit (bidi_it) == new_level
+                         && (bidi_it->type == WEAK_BN
+                             || bidi_it->type == WEAK_ET))
+                    ;
+                  type_of_next = bidi_it->type;
+                  en_pos = bidi_it->charpos;
+                  bidi_copy_it (bidi_it, &saved_it);
                 }
-              else if (type == WEAK_BN)
-                type = NEUTRAL_ON; /* W6/Retaining */
+              if (type_of_next == WEAK_EN)
+                {
+                  /* If the last strong character is AL, the EN we've
+                     found will become AN when we get to it (W2).  */
+                  if (bidi_it->last_strong.type_after_w1 != STRONG_AL)
+                    {
+                      type = WEAK_EN;
+                      /* Remember this EN position, to speed up processing
+                         of the next ETs.  */
+                      bidi_it->next_en_pos = en_pos;
+                    }
+                  else if (type == WEAK_BN)
+                    type = NEUTRAL_ON; /* W6/Retaining */
+                }
             }
         }
     }
diff -r 9e8415b885ee -r b1e1b45c9fb6 src/buffer.c
--- a/src/buffer.c	Fri Jan 01 09:46:25 2010 -0500
+++ b/src/buffer.c	Fri Jan 01 09:57:27 2010 -0500
@@ -2261,6 +2261,8 @@
   swapfield (undo_list, Lisp_Object);
   swapfield (mark, Lisp_Object);
   swapfield (enable_multibyte_characters, Lisp_Object);
+  swapfield (bidi_display_reordering, Lisp_Object);
+  swapfield (bidi_paragraph_direction, Lisp_Object);
   /* FIXME: Not sure what we should do with these *_marker fields.
      Hopefully they're just nil anyway.  */
   swapfield (pt_marker, Lisp_Object);