SetReplication Error. #18721
Comments
I found new information when I ran sudo kubectl logs alluxio-master-0 -c alluxio-master -n cm-alluxio.
I think this might be an issue with metadata synchronization. After I adjusted the replica count, the setReplica job appeared to complete, but the file metadata was not fully synchronized.
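The master log from the command above can be long. As a minimal sketch, the output can be narrowed to lines that mention replication. The helper name replication_lines is hypothetical, and the search pattern is only an assumption about which lines are relevant:

```shell
# Hypothetical helper: keep only the log lines on stdin that mention replication.
# The pattern 'replicat' is an assumption; it matches "replication", "Replicate", "setReplica", etc.
replication_lines() {
  grep -i 'replicat' || true   # do not fail the pipeline when nothing matches
}
```

It could be used as: sudo kubectl logs alluxio-master-0 -c alluxio-master -n cm-alluxio | replication_lines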
Could someone please clarify this for me? I would greatly appreciate it!
setReplication changes the number of replicas of an existing file. If the job service is running, it will replicate the file up to the set number or reduce the number of replicas. Hope this helps.
@yuzhu thanks for your kind explanation, but I'm still encountering this issue: I can't set the replication factor of a new file more than once. The first setting took effect but the second attempt didn't, and the job master pod log shows the following:
and the master pod log shows the following:
I ran the following:
Is there something wrong with my metadata that caused this problem?
I tried deploying Alluxio directly on the server, without K8S, and hit the same issue. When I tried to adjust the replica count of a new file that had been loaded into the Alluxio cache, the first attempt always worked but the second attempt failed.
By the way, my UFS is HDFS, and my alluxio-site.properties is below:
Alluxio Version:
v2.9.4
Describe the bug
First, I deployed Alluxio with Helm in a K8S cluster that has 1 master node and 7 worker nodes.
Second, inside an Alluxio worker pod, I ran "alluxio fs setReplication --max 3 --min 3 /test_ufs.txt", and it worked well the first time. However, when I then ran "alluxio fs setReplication --max 4 --min 4 /test_ufs.txt", it had no effect: the replica count remained 3.
Third, I found some information in Alluxio-master logs:
and information in Alluxio-job-master logs:
Fourth, I entered an Alluxio worker pod and checked the Alluxio job list:
It indicated that all the jobs were completed.
What confuses me is that the job list says all tasks are completed, yet the logs still show setReplication jobs running. This problem prevents me from repeatedly adjusting the number of replicas for a file in Alluxio.
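To cross-check the job list against the logs, one option is to filter the listing for anything that is not finished. This is only a sketch: the helper name unfinished_jobs is hypothetical, and it assumes each line printed by the job listing contains the job status as plain text (for example COMPLETED or RUNNING):

```shell
# Hypothetical helper: from a job listing on stdin, print only the lines
# whose status is not COMPLETED (assumes the status appears as plain text on each line).
unfinished_jobs() {
  grep -v 'COMPLETED' || true   # an empty result means every listed job is completed
}
```

It could be used as: alluxio job ls | unfinished_jobs. If nothing is printed while the logs still mention running jobs, the two views disagree, which matches the confusion described above.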
To Reproduce
Steps to reproduce the behavior (as minimally and precisely as possible)
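Based on the commands quoted in the bug description, the two attempts can be sketched as a small script. The function name set_and_check and the ALLUXIO_CMD variable are hypothetical, added here only so the sketch can be dry-run with echo; the stat call is an assumed way to inspect the file afterwards, not something taken from the original report:

```shell
#!/usr/bin/env bash
# Sketch of the reproduction: set a replication factor, then print the file's status.
# ALLUXIO_CMD is a hypothetical override; it defaults to the real CLI and can be set to `echo` for a dry run.
ALLUXIO_CMD="${ALLUXIO_CMD:-alluxio}"

set_and_check() {
  local replicas="$1" path="$2"
  # Same form as in the report: pin --max and --min to the same value.
  $ALLUXIO_CMD fs setReplication --max "$replicas" --min "$replicas" "$path"
  # Assumed follow-up check: print the file's metadata and block information.
  $ALLUXIO_CMD fs stat "$path"
}
```

Running set_and_check 3 /test_ufs.txt and then set_and_check 4 /test_ufs.txt repeats the two attempts described above; per the report, the second call leaves the replica count at 3.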
Expected behavior
A clear and concise description of what you expected to happen.
Urgency
Describe the impact and urgency of the bug.
Are you planning to fix it
Yes.
Additional context
properties in values.yaml