Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replication: Support GTID tag in PreviousGTIDsEvent #952

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

dveeden
Copy link
Collaborator

@dveeden dveeden commented Nov 21, 2024

Issue: ref #845

The PreviousGTIDsEvent / PREVIOUS_GTIDS_LOG_EVENT has changed to work with tagged GTIDs.

First the uuidCount has changed, it encodes the GTID format. Here format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself isn't printed but the tag is appended to the last entry.

Example 1

896e7882-18fe-11ef-ab88-22222d34d411:1-3

regular format, compatible with both formats

Example 2

896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1

tagged format.

Combination of

  • 896e7882-18fe-11ef-ab88-22222d34d411:1-4
  • 896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1

Example 3

896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2

Combination of:

896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                    :aaaa:1
                                    :abc:1-3
                                    :bbbbb:1
                                    :bbbbbb:1
                                    :x:1,
896e7882-18fe-11ef-ab88-22222d34d412:1-2

Please also see: mysqlbinlog --read-from-remote-server --hexdump $binlogfile to see how MySQL encodes/decodes this.

See also:

Issue: ref go-mysql-org#845

The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to
work with tagged GTIDs.

First the `uuidCount` has changed, it encodes the GTID format. Here
format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself
isn't printed but the tag is appended to the last entry.

Examples:

`896e7882-18fe-11ef-ab88-22222d34d411:1-3`

regular format, compatible with both formats

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1`

tagged format.

Combination of

- `896e7882-18fe-11ef-ab88-22222d34d411:1-4`
- `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1`

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2`

Combination of:
```
896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                    :aaaa:1
                                    🔤1-3
                                    :bbbbb:1
                                    :bbbbbb:1
                                    ❌1,
896e7882-18fe-11ef-ab88-22222d34d412:1-2
```

Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this.

See also:
- https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html
Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure Example 3 in PR description has the same format as MySQL. In https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html , there's

GTIDs originating from the same server but having different tags are treated in a manner similar to those originating from different servers, like this:

3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21, 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_2:8-52

rather than 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:Domain_2:8-52. I guess the reason is if Domain_2 is replaced by tag 23, we will see 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:23:8-52, which is ambiguous.

GtidFormatTagged
)

func decodeSid(data []byte) (format GtidFormat, sidnr uint64) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put the referenced MySQL source code or document in comment, so other reviewers can double check the implementation?

@dveeden
Copy link
Collaborator Author

dveeden commented Nov 26, 2024

I'm not sure Example 3 in PR description has the same format as MySQL. In https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html , there's

GTIDs originating from the same server but having different tags are treated in a manner similar to those originating from different servers, like this:

3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21, 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_2:8-52

rather than 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:Domain_2:8-52. I guess the reason is if Domain_2 is replaced by tag 23, we will see 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21:23:8-52, which is ambiguous.

mysql-9.1.0> SET gtid_next='automatic:aaa';
Query OK, 0 rows affected (0.00 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (1);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (2);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> SET gtid_next='automatic:bbbb';
Query OK, 0 rows affected (0.00 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (3);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (4);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (5);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (6);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> SET gtid_next='automatic:c';
Query OK, 0 rows affected (0.00 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (7);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> SET gtid_next='automatic:d';
Query OK, 0 rows affected (0.00 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (8);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> SET gtid_next='automatic';
Query OK, 0 rows affected (0.00 sec)

mysql-9.1.0> INSERT INTO t1 VALUES (9);
Query OK, 1 row affected (0.01 sec)

mysql-9.1.0> SET gtid_next='automatic:123';
ERROR 1774 (HY000): Malformed GTID specification 'automatic:123'.
mysql-9.1.0> show global variables like 'gtid_executed';
+---------------+-----------------------------------------------------------------+
| Variable_name | Value                                                           |
+---------------+-----------------------------------------------------------------+
| gtid_executed | 30c380a5-ac28-11ef-b590-ba33c4017a43:1:aaa:1-2:bbbb:1-4:c:1:d:1 |
+---------------+-----------------------------------------------------------------+
1 row in set (0.02 sec)

mysql-9.1.0> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 9.1.0     |
+-----------+
1 row in set (0.00 sec)

It looks like my example might be correct and the MySQL docs might not be in sync with the actual behavior.

Note that 23 can't be a valid tag.

From https://dev.mysql.com/doc/dev/mysql-server/latest/classmysql_1_1gtid_1_1Tag.html

Representation of the GTID tag.

Tag format: [a-z_]{a-z0-9_}{0,31} Tag may be created from a given text. Text may contain leading and trailing 
spaces which will be omitted during Tag creation. Text may also contain uppercase characters. Acceptable 
format for text is as follows: [:space:][a-zA-Z_][a-zA-Z0-9_]{0,31}[:space:]

@dveeden
Copy link
Collaborator Author

dveeden commented Nov 26, 2024

And with the exact GTIDs from the docs:

mysql-9.1.0> reset binary logs and gtids;
Query OK, 0 rows affected (0.03 sec)

mysql-9.1.0> set global gtid_purged='3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_1:1-3:15-21, 
    '> 3E11FA47-71CA-11E1-9E33-C80AA9429562:Domain_2:8-52';
Query OK, 0 rows affected (0.02 sec)

mysql-9.1.0> show global variables like 'gtid%';
+----------------------------------+-----------------------------------------------------------------------+
| Variable_name                    | Value                                                                 |
+----------------------------------+-----------------------------------------------------------------------+
| gtid_executed                    | 3e11fa47-71ca-11e1-9e33-c80aa9429562:domain_1:1-3:15-21:domain_2:8-52 |
| gtid_executed_compression_period | 0                                                                     |
| gtid_mode                        | ON                                                                    |
| gtid_owned                       |                                                                       |
| gtid_purged                      | 3e11fa47-71ca-11e1-9e33-c80aa9429562:domain_1:1-3:15-21:domain_2:8-52 |
+----------------------------------+-----------------------------------------------------------------------+
5 rows in set (0.00 sec)

@dveeden
Copy link
Collaborator Author

dveeden commented Nov 26, 2024

I've filed https://bugs.mysql.com/bug.php?id=116789 for this

@dveeden
Copy link
Collaborator Author

dveeden commented Nov 27, 2024

I also filed https://bugs.mysql.com/bug.php?id=116747 some days ago

Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a pure code-related review. I'll check GTID encoding and decoding soon

uuidCount := binary.LittleEndian.Uint16(data[pos : pos+8])

gtidinfo := make([]byte, 8)
copy(gtidinfo, data[:8])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the new copy is caused by decodeSid will modify data in-place. How about let decodeSid copy the needed data? So there's no risk that in future we need to change the code and caller forget to copy it.

replication/event.go Outdated Show resolved Hide resolved
if format == GtidFormatTagged {
tagLength := int(data[pos]) / 2
pos += 1
if tagLength > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the tag have zero length? please add comments

@@ -254,9 +298,14 @@ func (e *PreviousGTIDsEvent) Decode(data []byte) error {
}
intervals[i] = interval
}
previousGTIDSets[i] = fmt.Sprintf("%s:%s", uuid, strings.Join(intervals, ":"))
if isTag {
previousGTIDSets[currentSetnr-1] += fmt.Sprintf(":%s:%s", tag, strings.Join(intervals, ":"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because string is immutable, string += will create a new string object each time. If there are too many tags the performance will drop. Maybe store every "tag+intervals" and join them once at all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants