Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[linux-6.6.y] x86/tsc: Make cur->adjusted values in package#1 to be the same #265

Merged

Conversation

leoliu-oc
Copy link
Contributor

zhaoxin inclusion
category: bugfix
CVE: NA


When resume from S4 on Zhaoxin 2 packages platform that support X86_FEATURE_TSC_ADJUST, the following warning messages appear:

[ 327.445302] [Firmware Bug]: TSC ADJUST differs: CPU15 45960750 --> 78394770. Restoring
[ 329.209120] [Firmware Bug]: TSC ADJUST differs: CPU14 45960750 --> 78394770. Restoring
[ 329.209128] [Firmware Bug]: TSC ADJUST differs: CPU13 45960750 --> 78394770. Restoring
[ 329.209138] [Firmware Bug]: TSC ADJUST differs: CPU12 45960750 --> 78394770. Restoring
[ 329.209151] [Firmware Bug]: TSC ADJUST differs: CPU11 45960750 --> 78394770. Restoring
[ 329.209160] [Firmware Bug]: TSC ADJUST differs: CPU10 45960750 --> 78394770. Restoring
[ 329.209169] [Firmware Bug]: TSC ADJUST differs: CPU9 45960750 --> 78394770. Restoring

The reason is:

Step 1: Bring up.
TSC is sync after bring up with following settings:
MSR 0x3b cur->adjusted
Package#0 CPU 0-7 0 0
Package#1 first CPU value1 value1
Package#1 non-first CPU value1 value1

Step 2: Suspend to S4.
Settings in Step 1 are not changed in this Step.

Step 3: Bring up caused by S4 wake up event.
TSC is sync when bring up with following settings:
MSR 0x3b cur->adjusted
Package#0 CPU 0-7 0 0
Package#1 first CPU value2 value2
Package#1 non-first CPU value2 value2

Step 4: Resume from S4.
When resuming from S4, Current TSC synchronous mechanism
cause following settings:
MSR 0x3b cur->adjusted
Package#0 CPU 0-7 0 0
Package#1 first CPU value2 value2
Package#1 non-first CPU value2 value1

In these Steps, value1 != 0 and value2 != value1.

In Step4, as function tsc_store_and_check_tsc_adjust() do, when the value of MSR 0x3b on the non-first online CPU in package#1 is equal to the value of cur->adjusted on the first online CPU in the same package, the cur->adjusted value on this non-first online CPU will hold the old value1. This cause function tsc_verify_tsc_adjust() set the value of MSR 0x3b on the non-first online CPUs in the package#1 to the old value1 and print the beginning warning messages.

Fix it by setting cur->adjusted value on the non-first online CPU in a package to the value of MSR 0x3b on the same CPU when they are not equal.

zhaoxin inclusion
category: bugfix
CVE: NA

-----------------

When resume from S4 on Zhaoxin 2 packages platform that support
X86_FEATURE_TSC_ADJUST, the following warning messages appear:

[  327.445302] [Firmware Bug]: TSC ADJUST differs: CPU15 45960750 --> 78394770. Restoring
[  329.209120] [Firmware Bug]: TSC ADJUST differs: CPU14 45960750 --> 78394770. Restoring
[  329.209128] [Firmware Bug]: TSC ADJUST differs: CPU13 45960750 --> 78394770. Restoring
[  329.209138] [Firmware Bug]: TSC ADJUST differs: CPU12 45960750 --> 78394770. Restoring
[  329.209151] [Firmware Bug]: TSC ADJUST differs: CPU11 45960750 --> 78394770. Restoring
[  329.209160] [Firmware Bug]: TSC ADJUST differs: CPU10 45960750 --> 78394770. Restoring
[  329.209169] [Firmware Bug]: TSC ADJUST differs: CPU9 45960750 --> 78394770. Restoring

The reason is:

Step 1: Bring up.
	TSC is sync after bring up with following settings:
				MSR 0x3b	cur->adjusted
	Package#0 CPU 0-7	  0		  0
	Package#1 first CPU	  value1	  value1
	Package#1 non-first CPU	  value1	  value1

Step 2: Suspend to S4.
	Settings in Step 1 are not changed in this Step.

Step 3: Bring up caused by S4 wake up event.
	TSC is sync when bring up with following settings:
				MSR 0x3b	cur->adjusted
	Package#0 CPU 0-7	  0		  0
	Package#1 first CPU	  value2	  value2
	Package#1 non-first CPU	  value2	  value2

Step 4: Resume from S4.
	When resuming from S4, Current TSC synchronous mechanism
	cause following settings:
				MSR 0x3b	cur->adjusted
	Package#0 CPU 0-7	  0		  0
	Package#1 first CPU	  value2	  value2
	Package#1 non-first CPU	  value2	  value1

In these Steps, value1 != 0 and value2 != value1.

In Step4, as function tsc_store_and_check_tsc_adjust() do, when the value
of MSR 0x3b on the non-first online CPU in package#1 is equal to the value
of cur->adjusted on the first online CPU in the same package, the
cur->adjusted value on this non-first online CPU will hold the old value1.
This cause function tsc_verify_tsc_adjust() set the value of MSR 0x3b on
the non-first online CPUs in the package#1 to the old value1 and print the
beginning warning messages.

Fix it by setting cur->adjusted value on the non-first online CPU in a
package to the value of MSR 0x3b on the same CPU when they are not equal.

Signed-off-by: leoliu-oc <[email protected]>
@deepin-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zccrs for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@deepin-ci-robot
Copy link

Hi @leoliu-oc. Thanks for your PR.

I'm waiting for a deepin-community member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@opsiff
Copy link
Member

opsiff commented Jun 25, 2024

/ok-to-test

@opsiff opsiff merged commit 729d693 into deepin-community:linux-6.6.y Jun 25, 2024
2 of 3 checks passed
@deepin-ci-robot
Copy link

deepin pr auto review

关键摘要:

  • 在新增代码中,对boot_cpu_data.x86_vendor的检查应该放在if (bootval != ref->adjusted)的条件判断之外,以确保在bootvalref->adjusted相等时也能执行该代码。
  • 新增代码中,对X86_VENDOR_CENTAURX86_VENDOR_ZHAOXIN的检查应该添加注释说明其目的,以便其他开发者理解这一逻辑。
  • 新增代码中,变量bootval被多次使用,建议将其定义为局部变量以避免潜在的重复使用问题。

是否建议立即修改:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants