International Federation of Digital Seismograph Networks

Thread: Proposal for a central FDSN data center registry

Started: 2019-07-09 06:08:42
Last activity: 2019-07-11 12:44:03
Chad Trabant
2019-07-09 06:08:42
Dear Working Group III members,

EIDA and IRIS have collaborated on the development of a data center registry for the FDSN and will be proposing this as a standard, centralized FDSN service at the meeting in Montreal. To allow (a bit of) review prior to the meeting and an opportunity for feedback from those not able to attend the meeting, we introduce this service below.

The FDSN data center registry is designed to address:
- discovery of FDSN data centers (in machine-readable format)
- discovery of services offered by each data center
- identification of primary, secondary, etc. center for data sets offered by each center

Entries in the registry, after being approved, are maintained by the data centers, with multiple options for submitting updates.

The prototype landing page for the central registry is here:

http://www.fdsn.org/datacenters/

This page would ultimately replace the manually-maintained, HTML-only list here: http://www.fdsn.org/webservices/datacenters/

A web service on www.fdsn.org (linked from the landing page) provides a machine-usable interface to search for entries and return them in JSON format.

The JSON format used to exchange the data center registry information is described here:

https://github.com/iris-edu/datacenter-registry

Feedback welcome and appreciated.

regards,

Javier Quinteros (GEOFON)
Chad Trabant (IRIS DMC)



  • Philip Crotwell
    2019-07-10 16:57:49
    Hi

    Thanks for putting this together. Two comments on this.

    First, could you clarify how the url to a web service should be entered?
    The examples show things like:
    http://datacenter.org/fdsnws/dataselect/1
    but of course the actual web service url would be
    http://datacenter.org/fdsnws/dataselect/1/query
    along with queryauth, version, etc.

    Is the url meant to be for a human or for client software to access
    directly? I would prefer the latter (or both), but would rather client
    software not have to guess whether it needs to append "query" to the
    path. So it would be helpful to clearly specify whether the "service
    method" should be appended to the url or not.

    Second, I am leery of combining the dataset and priority information with
    the list of web services into a single access point. The dataset
    information is almost by definition going to be out of date and I fear it
    is unlikely to be updated regularly. It also seems to conflate waveform
    data access with all the other possible services. In the example, the
    ARCHIVE repository within the example datacenter may have priority 1 for
    waveform data from the XX network, but that fact likely has little to do
    with the fdsn-event web service. I would prefer a clean, simple list of
    "web services" from each data center, like a naming service, without the
    added complexity of the nested repositories and datasets. If getting the
    information of which datacenter is priority for which network/station is
    considered important, at least let it be a separate service.

    thanks
    Philip


    ----------------------
    FDSN Working Group III
    Topic home: http://www.fdsn.org/message-center/topic/fdsn-wg3-products/ |
    Unsubscribe: fdsn-wg3-products-unsubscribe<at>lists.fdsn.org

    Sent from the FDSN Message Center (http://www.fdsn.org/message-center/)
    Update subscription preferences at http://www.fdsn.org/account/profile/


    • Javier Quinteros
      2019-07-11 11:24:28
      Dear Philip,

      thanks a lot for the feedback. Please find my comments inline below.

      On 10.07.19 22:59, Philip Crotwell wrote:

      > Hi
      >
      > Thanks for putting this together. Two comments on this.
      >
      > First, could you clarify how the url to a web service should be entered?
      > The examples show things like:
      > http://datacenter.org/fdsnws/dataselect/1
      > but of course the actual web service url would be
      > http://datacenter.org/fdsnws/dataselect/1/query
      > along with queryauth, version, etc.

      All services have different methods defined by their specifications and
      usually only one entry point. That entry point is exactly what we are
      including in the "url". The client reading this information can complete
      the rest, knowing that the service is "fdsnws-dataselect-1". The same
      holds for other services. The aim of this system is not to show how to
      use the different services, but where the data actually is and which
      services can be used.



      > Is the url meant to be for a human or for client software to access
      > directly? I would prefer the latter (or both), but would rather client
      > software not have to guess whether it needs to append "query" to the
      > path. So it would be helpful to clearly specify whether the "service
      > method" should be appended to the url or not.

      As I mentioned above, a client/user should have some previous knowledge
      to use the service. There is no need to "guess". If the client does not
      know that there is a "query" method, it will probably not know how to
      use the parameters to request exactly the data needed.

      And what about a client that wants to check just the version number or
      some features of the different services providing (meta)data? Then, you
      don't want to call the "query" method. You call either "version" or
      "application.wadl". Therefore, there is no need to add the method in
      advance.
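
A minimal sketch of how a client might complete the registered base URL with a method, assuming the registry stores only the entry point as described above. The helper name `method_url` and the `KNOWN_METHODS` tuple are hypothetical; the method names are the ones mentioned in this thread, not a complete list from any specification.

```python
# Methods mentioned in this thread; illustrative, not an exhaustive list.
KNOWN_METHODS = ("query", "queryauth", "version", "application.wadl")


def method_url(base_url: str, method: str) -> str:
    """Append a service method to the base URL read from the registry."""
    if method not in KNOWN_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    # Strip any trailing slash so the result never contains "//" before the method.
    return base_url.rstrip("/") + "/" + method


if __name__ == "__main__":
    base = "http://datacenter.org/fdsnws/dataselect/1"
    print(method_url(base, "query"))
    print(method_url(base, "version"))
```

The same base URL serves both a data request ("query") and a capability check ("version" or "application.wadl"), which is why the method is left out of the registered value.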



      > Second, I am leery of combining the dataset and priority information
      > with the list of web services into a single access point. The dataset
      > information is almost by definition going to be out of date and I fear
      > it is unlikely to be updated regularly. It also seems to conflate
      > waveform data access with all the other possible services. In the
      > example, the ARCHIVE repository within the example datacenter may have
      > priority 1 for waveform data from the XX network, but that fact likely
      > has little to do with the fdsn-event web service. I would prefer a
      > clean, simple list of "web services" from each data center, like a
      > naming service, without the added complexity of the nested repositories
      > and datasets. If getting the information of which datacenter is priority
      > for which network/station is considered important, at least let it be a
      > separate service.

      One of the main reasons this proposal was conceived actually comes from
      the need of many data centres to formally declare how their data
      holdings should be treated when harvested. The link between data and
      services also seems quite natural. It is clear that data is updated more
      often than services, but if services are defined in a separate place,
      what should a harvest client do? Where should it read the basic
      information?
      If it reads from the list of services, then it must also read the list
      of datasets, because some datasets may need to be skipped during
      harvesting.
      And if it reads from the list of datasets, it must also read the list of
      services to get the entry points of the services.

      Again, thanks a lot for the feedback!

      Javier


      --
      Javier Quinteros
      -------------------------------------------
      2.4/Seismologie
      Tel.: +49 (0)331/288-1931
      Fax: +49 (0)331/288-1277
      Email: javier<at>gfz-potsdam.de
      ___________________________________

      Helmholtz-Zentrum Potsdam
      Deutsches GeoForschungsZentrum GFZ
      Stiftung des öff. Rechts Land Brandenburg
      Telegrafenberg, 14473 Potsdam


      • Chad Trabant
        2019-07-11 08:44:04
        Hi Philip,

        I am in complete agreement with what Javier wrote, in particular the natural connection between data sets and services. I will only add one small detail: including the 'datasets' information via the proposed service on www.fdsn.org can be controlled via the 'includeDatasets' parameter, which defaults to false. Setting 'includeDatasets' to false (or not setting it) gives a caller a list of services per repository per data center, which is pretty close to what you expressed a desire for.
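
A minimal sketch of how a caller might toggle the 'includeDatasets' parameter when building a request. The endpoint `http://registry.example.org/query` is a placeholder (the real service is linked from the landing page), the helper name `registry_url` is hypothetical, and the literal value "true" is an assumption about how the parameter is spelled on the wire.

```python
from urllib.parse import urlencode


def registry_url(endpoint: str, include_datasets: bool = False) -> str:
    """Build a registry request URL, sending includeDatasets only when enabled."""
    if not include_datasets:
        # Omitting the parameter relies on the stated default of false.
        return endpoint
    # Assumption: the service accepts the literal string "true".
    return endpoint + "?" + urlencode({"includeDatasets": "true"})


if __name__ == "__main__":
    endpoint = "http://registry.example.org/query"  # placeholder, not the real endpoint
    print(registry_url(endpoint))        # services only
    print(registry_url(endpoint, True))  # services plus datasets
```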

        regards,
        Chad


        • Philip Crotwell
          2019-07-11 12:44:03
          Hi

          Thanks for your rapid responses.

          I am fine with "query" not being part of the URL. What I am asking is that
          the documentation clearly state that it should NOT be included.
          Without it being spelled out, some datacenters will put "query" in
          the URL and some will not, meaning that a client has to guess. And heaven
          forbid someone actually sets up a web service where the access point
          really is
          http://datacenter.org/fdsnws/dataselect/1/query/query

          Even better would be for the update mechanisms to enforce that the URL for the
          FDSN standard web services matches the spec, i.e.
          <host>/fdsnws/<type>/<version>. Some guidance for non-standard web services
          that may or may not follow this pattern would be nice as well.
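
A minimal sketch of the kind of check an update mechanism could apply, assuming the pattern <host>/fdsnws/<type>/<version> given above. The function name `is_base_fdsnws_url` is hypothetical, and the regular expression encodes assumptions (lowercase letters for the type, digits for the version, an optional trailing slash) that are not taken from any specification text.

```python
import re
from urllib.parse import urlsplit

# Assumed shape of the path: /fdsnws/<type>/<version>, optional trailing slash.
_BASE_PATH = re.compile(r"^/fdsnws/(?P<type>[a-z]+)/(?P<version>\d+)/?$")


def is_base_fdsnws_url(url: str) -> bool:
    """Return True if url looks like <host>/fdsnws/<type>/<version> with no method."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    if parts.query or parts.fragment:
        # A base URL should carry no query string or fragment.
        return False
    return _BASE_PATH.match(parts.path) is not None


if __name__ == "__main__":
    for candidate in (
        "http://datacenter.org/fdsnws/dataselect/1",
        "http://datacenter.org/fdsnws/dataselect/1/query",
    ):
        print(candidate, is_base_fdsnws_url(candidate))
```

A check like this would reject entries that already include "query", so clients could append the method unconditionally. Non-standard services would need a separate rule.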

          You are right that includeDatasets defaulting to false largely addresses my
          concerns.

          Can a network operator use the update mechanism to delegate primary status to
          another datacenter? For example, I am responsible for the CO network, but I
          do not operate a datacenter, so I would like IRIS to be listed as
          the primary source. Documenting how I should proceed would be helpful, as I
          expect that to be a relatively common scenario.

          thanks
          Philip
