
Issue 634064

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 611064
Owner: ----
Closed: Aug 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Moblab schedule: duplicate entry for host_queue_entries_job_id_and_host_id key

Project Member Reported by sbasi@chromium.org, Aug 3 2016

Issue description

Dan's moblab's scheduler is stuck in a bad state.

IP is 100.96.48.101

Not sure how he got into this state.


08/03 11:33:48.996 DEBUG|               rdb:0416| Host acquisition stats: distinct requests: 3, leased hosts: 1, unsatisfied requests: 79
08/03 11:33:48.996 INFO |  scheduler_models:0530| Assigning host 192.168.231.100 to entry HQE: 1283, for job: 1276 and host: no host has status:Queued
08/03 11:33:49.005 ERROR|     email_manager:0082| Uncaught exception; terminating monitor_db
Traceback (most recent call last):
  File "/usr/local/autotest/scheduler/monitor_db.py", line 179, in main_without_exception_handling
    dispatcher.tick()
  File "/usr/local/autotest/scheduler/site_monitor_db.py", line 106, in tick
    super(SiteDispatcher, self).tick()
  File "/usr/local/autotest/scheduler/monitor_db.py", line 354, in tick
    self._schedule_new_jobs()
  File "/usr/local/autotest/scheduler/site_monitor_db.py", line 158, in _schedule_new_jobs
    super(SiteDispatcher, self)._schedule_new_jobs()
  File "/usr/local/autotest/scheduler/monitor_db.py", line 842, in _schedule_new_jobs
    self._schedule_host_job(host_assignment.host, host_assignment.job)
  File "/usr/local/autotest/scheduler/monitor_db.py", line 800, in _schedule_host_job
    self._host_scheduler.schedule_host_job(host, queue_entry)
  File "/usr/local/autotest/scheduler/host_scheduler.py", line 233, in schedule_host_job
    queue_entry.set_host(host)
  File "/usr/local/autotest/scheduler/scheduler_models.py", line 531, in set_host
    self.update_field('host_id', host.id)
  File "/usr/local/autotest/scheduler/scheduler_models.py", line 308, in update_field
    _db.execute(query, (value, self.id))
  File "/usr/local/autotest/database/database_connection.py", line 312, in execute
    results = self._backend.execute(query, parameters)
  File "/usr/local/autotest/database/database_connection.py", line 132, in execute
    parameters=parameters)
  File "/usr/local/autotest/database/database_connection.py", line 54, in execute
    self._cursor.execute(query, parameters)
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 122, in execute
    six.reraise(utils.IntegrityError, utils.IntegrityError(*tuple(e.args)), sys.exc_info()[2])
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 120, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
IntegrityError: (1062, "Duplicate entry '1276-1' for key 'host_queue_entries_job_id_and_host_id'")
08/03 11:33:49.006 ERROR|     email_manager:0054| monitor_db exception
EXCEPTION: Uncaught exception; terminating monitor_db
Traceback (most recent call last):
  File "/usr/local/autotest/scheduler/monitor_db.py", line 179, in main_without_exception_handling
    dispatcher.tick()
  File "/usr/local/autotest/scheduler/site_monitor_db.py", line 106, in tick
    super(SiteDispatcher, self).tick()
  File "/usr/local/autotest/scheduler/monitor_db.py", line 354, in tick
    self._schedule_new_jobs()
  File "/usr/local/autotest/scheduler/site_monitor_db.py", line 158, in _schedule_new_jobs
    super(SiteDispatcher, self)._schedule_new_jobs()
  File "/usr/local/autotest/scheduler/monitor_db.py", line 842, in _schedule_new_jobs
    self._schedule_host_job(host_assignment.host, host_assignment.job)
  File "/usr/local/autotest/scheduler/monitor_db.py", line 800, in _schedule_host_job
    self._host_scheduler.schedule_host_job(host, queue_entry)
  File "/usr/local/autotest/scheduler/host_scheduler.py", line 233, in schedule_host_job
    queue_entry.set_host(host)
  File "/usr/local/autotest/scheduler/scheduler_models.py", line 531, in set_host
    self.update_field('host_id', host.id)
  File "/usr/local/autotest/scheduler/scheduler_models.py", line 308, in update_field
    _db.execute(query, (value, self.id))
  File "/usr/local/autotest/database/database_connection.py", line 312, in execute
    results = self._backend.execute(query, parameters)
  File "/usr/local/autotest/database/database_connection.py", line 132, in execute
    parameters=parameters)
  File "/usr/local/autotest/database/database_connection.py", line 54, in execute
    self._cursor.execute(query, parameters)
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 122, in execute
    six.reraise(utils.IntegrityError, utils.IntegrityError(*tuple(e.args)), sys.exc_info()[2])
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 120, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
IntegrityError: (1062, "Duplicate entry '1276-1' for key 'host_queue_entries_job_id_and_host_id'")
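
For readers unfamiliar with MySQL error 1062, the following is a minimal sketch of how an UPDATE like the one in the traceback can trip a unique key on (job_id, host_id). It is not the Autotest code: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table layout is a simplified assumption, and the pre-existing entry with id 1200 is hypothetical. Only the values 1283, 1276 and 1 come from the log above.

```python
import sqlite3


def reproduce_duplicate_key():
    """Trigger a unique-key violation similar to the one in the traceback."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema; only the unique key matters here.
    conn.execute(
        """
        CREATE TABLE host_queue_entries (
            id INTEGER PRIMARY KEY,
            job_id INTEGER NOT NULL,
            host_id INTEGER,
            UNIQUE (job_id, host_id)
        )
        """
    )
    # Assumed pre-existing entry that already pairs job 1276 with host 1.
    # The id 1200 is made up for illustration.
    conn.execute("INSERT INTO host_queue_entries VALUES (1200, 1276, 1)")
    # The queued entry from the log: HQE 1283 for job 1276, no host yet.
    conn.execute("INSERT INTO host_queue_entries VALUES (1283, 1276, NULL)")
    try:
        # Roughly the kind of statement update_field('host_id', host.id)
        # appears to issue: set one column on the row with this id.
        conn.execute(
            "UPDATE host_queue_entries SET host_id = ? WHERE id = ?",
            (1, 1283),
        )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce_duplicate_key())
```

If this model is right, the scheduler will keep failing on every tick until the row that already holds the (1276, 1) pair is dealt with, which would match the "stuck" behavior described above.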

 

Comment 1 by sbasi@chromium.org, Aug 3 2016

Cc: fdeng@chromium.org shuqianz@chromium.org
Fang, Charlene, can you help? This looks similar to crbug.com/611064.

Comment 2 by dchan@google.com, Aug 3 2016

Here are the jobs that caused this:
FAFT bios http://100.96.48.101/afe/#tab_id=view_job&object_id=1278
FAFT ec http://100.96.48.101/afe/#tab_id=view_job&object_id=1279

Comment 3 by fdeng@chromium.org, Aug 3 2016

Did you do something similar to #4 in crbug.com/611064? That's an unsupported flow that will cause issues. To fix your moblab,
you can try deleting the offending row from MySQL.
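
As a rough illustration of that cleanup, here is a sketch that looks up the rows already holding a given (job_id, host_id) pair and deletes them. Everything about the schema is an assumption: it runs against an in-memory sqlite3 table rather than the moblab's MySQL database, the table and helper names are hypothetical, and entry id 1200 is invented. Which row is actually safe to delete on a real moblab is not stated in this thread, so inspect the result of the lookup before deleting anything.

```python
import sqlite3


def make_demo_db():
    """Build a simplified stand-in table with a (job_id, host_id) unique key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_queue_entries ("
        " id INTEGER PRIMARY KEY,"
        " job_id INTEGER NOT NULL,"
        " host_id INTEGER,"
        " UNIQUE (job_id, host_id))"
    )
    # 1200 is a made-up entry that already pairs job 1276 with host 1;
    # 1283 is the queued entry from the log, still without a host.
    conn.executemany(
        "INSERT INTO host_queue_entries VALUES (?, ?, ?)",
        [(1200, 1276, 1), (1283, 1276, None)],
    )
    return conn


def find_conflicting_entries(conn, job_id, host_id):
    """Return ids of entries that already hold the (job_id, host_id) pair."""
    rows = conn.execute(
        "SELECT id FROM host_queue_entries"
        " WHERE job_id = ? AND host_id = ? ORDER BY id",
        (job_id, host_id),
    ).fetchall()
    return [row[0] for row in rows]


def delete_entries(conn, entry_ids):
    """Delete the given entries by primary key."""
    conn.executemany(
        "DELETE FROM host_queue_entries WHERE id = ?",
        [(entry_id,) for entry_id in entry_ids],
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db()
    ids = find_conflicting_entries(conn, job_id=1276, host_id=1)
    print("conflicting entry ids:", ids)
    delete_entries(conn, ids)
    # With the conflicting row gone, the host assignment no longer collides.
    conn.execute("UPDATE host_queue_entries SET host_id = 1 WHERE id = 1283")
    print("update succeeded after cleanup")
```

Splitting the lookup from the delete is deliberate: it leaves room to review the matching ids before removing any rows.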

Comment 4 by dchan@google.com, Aug 4 2016

- Go to http://100.96.48.101/afe/#tab_id=view_job&object_id=1196
- Click the Clone button and select similar host (the default).
- Click Submit Job. It returned an error about 0 hosts selected (I can't remember the exact wording). I then changed the "Use [1] host for execution" setting and resubmitted.


Comment 5 by dchan@google.com, Aug 4 2016

I terminated and resubmitted the job, and it looks like the page doesn't complain about the 0 hosts any more. I resubmitted the jobs:
http://100.96.48.101/afe/#tab_id=view_job&object_id=1438 (BIOS)
http://100.96.48.101/afe/#tab_id=view_job&object_id=1412 (EC)

Comment 6 by sbasi@chromium.org, Aug 4 2016

Mergedinto: 611064
Status: Duplicate (was: Untriaged)
De-duping this with crbug.com/611064.
