S17+ board disappears

Answered

    Hello!
    The log shows that the third hash board has dropped a chip and needs repair. It also shows frequent network interruptions, so we recommend checking the network to avoid machine failures caused by them.

    Could it be something with the machine itself? This isn't the first time this has happened, or the first time I have had to send it in to Bitmain for repair. I will check the network connection. Are there any routers you could recommend?

    Hello!
    A hashboard that has dropped a chip can only be fixed by repair. For the network problem, check whether the network cable, switch, or router port is faulty or loose. Sorry, we don't sell routers, so we can't recommend one.
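
    One way to tell whether the interruptions come from the local network is to probe the miner from another machine on the same LAN and record when it stops answering. The sketch below is only an illustration: the function names `tcp_reachable` and `monitor` are made up for this example, and it assumes the miner's web interface accepts TCP connections on a port you know (commonly 80).

```python
import socket
import time


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def monitor(host, port, attempts=10, interval=1.0, check=tcp_reachable):
    """Probe the host repeatedly and return timestamps of the failed attempts."""
    failures = []
    for _ in range(attempts):
        if not check(host, port):
            failures.append(time.strftime("%Y-%m-%d %H:%M:%S"))
        time.sleep(interval)
    return failures
```

    For example, `monitor("192.168.1.100", 80, attempts=60)` (a placeholder address, replace it with the miner's IP) returns the times at which the miner could not be reached. If failures persist after swapping the cable or switch port, the cause is more likely to be the router or the miner itself.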

    OK, I submitted another ticket so it can be repaired again. Hopefully it works this time. Thanks for the help.

    I have a similar problem: one hashboard fails to load at startup. This is what the kernel log shows:

    2021-02-26 22:30:38 driver-btm-api.c:397:set_order_clock: chain[0]: set order clock, stragegy 3
    2021-02-26 22:30:38 driver-btm-api.c:397:set_order_clock: chain[1]: set order clock, stragegy 3
    2021-02-26 22:30:38 driver-btm-api.c:397:set_order_clock: chain[2]: set order clock, stragegy 3
    2021-02-26 22:30:39 driver-hash-chip.c:502:set_clock_delay_control: core_data = 0x34
    2021-02-26 22:30:39 driver-btm-api.c:1854:check_clock_counter: freq 50 clock_counter_limit 6
    2021-02-26 22:30:39 voltage[0] = 1940
    2021-02-26 22:30:39 voltage[1] = 1940
    2021-02-26 22:30:39 voltage[2] = 1940
    2021-02-26 22:30:39 power_api.c:226:set_working_voltage_raw: working_voltage_raw = 1940
    2021-02-26 22:30:40 temperature.c:340:calibrate_temp_sensor_one_chain: chain 0 temp sensor NCT218
    2021-02-26 22:30:42 temperature.c:340:calibrate_temp_sensor_one_chain: chain 1 temp sensor NCT218
    2021-02-26 22:30:43 temperature.c:340:calibrate_temp_sensor_one_chain: chain 2 temp sensor NCT218
    2021-02-26 22:30:43 uart.c:72:set_baud: set fpga_baud to 12000000
    2021-02-26 22:30:44 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 0, chip = 14, reg = 0
    2021-02-26 22:30:44 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 0, chip = 14, reg = 1
    2021-02-26 22:30:44 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 10, reg = 0
    2021-02-26 22:30:45 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 10, reg = 1
    2021-02-26 22:30:45 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 54, reg = 0
    2021-02-26 22:30:45 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 54, reg = 1
    2021-02-26 22:30:46 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 50, reg = 0
    2021-02-26 22:30:46 temperature.c:838:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 50, reg = 1
    2021-02-26 22:30:47 temperature.c:865:get_temp_info: ERROR: chain 1 can get NONE temp info or temp value abnormal, power it off 
    2021-02-26 22:30:48 driver-btm-api.c:264:check_bringup_temp: Bring up temperature is 24
     
    What is temperature.c:865?

    Hello!
    Try cleaning the dust off the chain 1 hashboard, and check whether the small fan in the power supply is spinning properly. If it is not, try a replacement power supply. If the board still does not recover, we recommend sending it in for repair.
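
    On the question about the log prefix: entries such as `temperature.c:865:get_temp_info` appear to follow a file:line:function pattern, so `temperature.c:865` is most likely the source line in the firmware that printed the message, not an error code. To see at a glance which chain and which sensors fail, the log can be summarized with a short script. This is a minimal sketch that assumes the line format shown in the log above; the function name `failed_sensors_by_chain` is made up for this example.

```python
import re
from collections import defaultdict

# Matches the "read temp sensor failed" lines in the format shown in the log.
FAIL_RE = re.compile(
    r"read temp sensor failed: chain = (\d+), sensor = (\d+), "
    r"chip = (\d+), reg = (\d+)"
)


def failed_sensors_by_chain(log_text):
    """Map each chain index to a sorted list of (sensor, chip) pairs that failed."""
    failures = defaultdict(set)
    for line in log_text.splitlines():
        match = FAIL_RE.search(line)
        if match:
            chain, sensor, chip, _reg = map(int, match.groups())
            # A sensor usually fails on several registers; count it once.
            failures[chain].add((sensor, chip))
    return {chain: sorted(pairs) for chain, pairs in failures.items()}
```

    Run on the log above, this reports failures only for chain 1, on all four of its sensors, which matches the `chain 1 can get NONE temp info` error that powers the board off.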

    Your problem is a missing hash board. First, test with a known-good graphics card to see whether it works normally; this shows whether the fault lies with the graphics card or the motherboard. Usually this happens because the graphics card has a short circuit. For more questions, please contact me by email: 113214237@qq.com
