1. Sep 18, 2020
  2. Sep 15, 2020
  3. Sep 11, 2020
  4. Sep 10, 2020
  5. Sep 09, 2020
  6. Aug 17, 2020
  7. Aug 07, 2020
  8. Jul 28, 2020
    • Merge branch '75-daemon-chunk-storage-backend-crashes-the-server-on-error' into 'master' · 7778f598
      Alberto Miranda authored
      Resolve "[daemon] chunk storage backend crashes the server on error"
      
      Closes #75
      
      The `ChunkStorage` backend class on the daemon threw `system_errors` that were never caught, crashing the server. `ChunkStorage` now uses a designated error class for the errors that may occur. In addition, the dependency on Argobots, which was used to trigger `ABT_eventuals`, was removed, laying the groundwork for future non-Argobots IO implementations. Further, the whole class was refactored for consistency and failure resistance.
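
As a rough illustration of the "designated error class" idea, the sketch below shows a storage call throwing a dedicated exception type that the calling handler catches and turns into an error code instead of letting it escape. The names `ChunkStorageError`, `remove_chunk_dir`, and `handle_remove` are hypothetical and are not taken from the merge request; deriving from `std::system_error` is likewise an assumption.

```cpp
#include <cerrno>
#include <string>
#include <system_error>

// Hypothetical error type (name assumed): carries an errno-style code
// plus a message, so callers can catch exactly this type.
class ChunkStorageError : public std::system_error {
public:
    ChunkStorageError(int err_code, const std::string& msg)
        : std::system_error(err_code, std::generic_category(), msg) {}
};

// Hypothetical storage call that reports failure through the error class.
void remove_chunk_dir(bool fail) {
    if (fail) {
        throw ChunkStorageError(ENOENT, "chunk directory not found");
    }
}

// Hypothetical handler: catches the storage error and returns its code
// (0 on success) so it can be sent back in the RPC response.
int handle_remove(bool fail) {
    try {
        remove_chunk_dir(fail);
        return 0;
    } catch (const ChunkStorageError& e) {
        return e.code().value();
    }
}
```

With this shape, a failing storage call yields a non-zero code for the client rather than an uncaught exception that terminates the process.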
      
      A new class, `ChunkOperation`, wraps Argobots' IO task operations, which allows IO-queue-specific code to be removed from the RPC handlers, i.e., the read and write handlers. The idea is to separate eventuals, tasks, and their arguments from the handler logic into a designated class. To that end, the handlers instantiate an object of a class derived from `ChunkOperation`, and that object drives all IO tasks. The corresponding code was added to the read and write RPC handlers. Note that `ChunkOperation` is not thread-safe and is supposed to be called by a single thread.
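
The separation described above might look roughly like the following sketch. It is not the actual GekkoFS code: the member names (`add_task`, `wait_for_tasks`, `write_nonblock`, `total_written`) are assumptions, and the tasks run synchronously here so that the example stays self-contained, whereas the real class hands them to Argobots.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical base class: owns the per-task state so that RPC handlers
// do not have to. Not thread-safe; meant to be driven by a single thread.
class ChunkOperation {
public:
    virtual ~ChunkOperation() = default;

    // Register one task. The real code would submit it to the IO queue.
    void add_task(std::function<int()> task) {
        tasks_.push_back(std::move(task));
    }

    // Run all registered tasks and return the first non-zero error code,
    // or 0 if every task succeeded.
    int wait_for_tasks() {
        int err = 0;
        for (auto& task : tasks_) {
            const int rc = task();
            if (rc != 0 && err == 0) {
                err = rc;
            }
        }
        tasks_.clear();
        return err;
    }

private:
    std::vector<std::function<int()>> tasks_;
};

// Hypothetical derived class that a write handler would instantiate.
class ChunkWriteOperation : public ChunkOperation {
public:
    // Stand-in for a chunk write: only accounts for the bytes "written".
    void write_nonblock(std::size_t chunk_id, std::size_t size) {
        add_task([this, chunk_id, size]() -> int {
            (void) chunk_id; // a real task would address this chunk
            total_written_ += size;
            return 0;
        });
    }

    std::size_t total_written() const { return total_written_; }

private:
    std::size_t total_written_ = 0;
};
```

A handler would then create one `ChunkWriteOperation`, call `write_nonblock` once per chunk, and read the result after `wait_for_tasks()`, keeping the queueing details out of the handler body.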
      
      In addition, truncate was reworked for error handling (it crashed the server on error). It now uses the IO queue as well, since truncate causes a write operation and should not overtake IO tasks in the queue.
      
      The chunk stat RPC handler was refactored for error handling and now uses error codes as well.
      
      Further minor changes:
      - dead chunk stat code has been removed
      - some namespaces were missing: `gkfs::rpc`
      - more flexible handler cleanup and response code
      - fixed a bug where the chunk dir wasn't removed when the metadata didn't exist on the same node
      
      Misc:
      There was some discussion about putting the removal of the chunk directory into the IO queue as well, with the same argument as for truncate, but I refrained from doing so as it would likely hurt remove performance notably. I think we can put this under *eventual consistency* and call it a day for now. Truncate was another story, as glibc makes heavy use of truncate in various operations.
      
      See merge request !32
    • unlink() - Bugfix: Do not send two RPCs to same hosts in small files case · e1054ae7
      Marc Vef authored and Alberto Miranda committed
    • Adding truncate test · 2e2c190b
      Marc Vef authored and Alberto Miranda committed
    • daemon: Minor fix for sizeof() in eventual callbacks · 23483f6c
      Marc Vef authored and Alberto Miranda committed
    • daemon: adding write retry loop; retry for EINTR, EAGAIN, EWOULDBLOCK · 9ab18557
      Marc Vef authored and Alberto Miranda committed
    • daemon: changing terminology from async to nonblock · f2927d77
      Marc Vef authored and Alberto Miranda committed
    • Marc Vef's avatar