Data deduplication vs. cloud data availability

kalaivani M

@kalaivani-m-Y29OXC Oct 16, 2024
Can anyone clear up a doubt for me?

Will data deduplication affect cloud data availability?

Cloud service providers deduplicate similar content in order to manage their storage.

Doing so frees up storage, but won't it affect the availability of the data in the cloud? We prefer the cloud precisely because it keeps multiple copies, and deduplication seems to be the reverse of that expectation.

So are deduplication and cloud availability a trade-off?

Replies

  • Kaustubh Katdare

    @thebigk Aug 3, 2014

    kalaivani M wrote:
    "Can anyone clear up a doubt for me?

    Will data deduplication affect cloud data availability?"
    You need to explain your question in a bit more detail.
  • Anoop Kumar

    @anoop-kumar-GDGRCn Aug 4, 2014

    I'm not sure which kind of deduplication you're referring to, but according to the definition:
    "Data deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data."

    Here is the process I found (#-Link-Snipped-#):
    1. Divide the input data into blocks or “chunks.”
    2. Calculate a hash value for each block of data.
    3. Use these values to determine if another block of the same data has already been stored.
    4. Replace the duplicate data with a reference to the object already in the database.
    With this process, the cloud does not discard any data: it keeps one copy of each unique block and replaces the duplicates with references to that copy, which seems reasonable.
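
    The four steps above can be sketched roughly as follows. This is only a toy illustration, not how any particular provider implements it: the `DedupStore` class, the 4-byte `CHUNK_SIZE`, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # assumed tiny fixed chunk size, only to keep the demo readable


class DedupStore:
    """Hypothetical in-memory store that keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}  # hash -> chunk bytes (one physical copy each)
        self.files = {}   # file name -> list of chunk hashes (the references)

    def put(self, name, data):
        refs = []
        # Step 1: divide the input data into chunks.
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            # Step 2: calculate a hash value for the chunk.
            digest = hashlib.sha256(chunk).hexdigest()
            # Step 3: check whether the same chunk is already stored.
            if digest not in self.chunks:
                self.chunks[digest] = chunk
            # Step 4: the file keeps only a reference to the stored chunk.
            refs.append(digest)
        self.files[name] = refs

    def get(self, name):
        # Rebuild the file by following its references.
        return b"".join(self.chunks[d] for d in self.files[name])


store = DedupStore()
store.put("a.txt", b"AAAABBBBCCCC")
store.put("b.txt", b"AAAABBBBDDDD")
print(len(store.chunks))    # 4 unique chunks stored instead of 6
print(store.get("b.txt"))   # b'AAAABBBBDDDD'
```

    Both files can still be read back in full, even though the two chunks they share are stored only once.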
  • kalaivani M

    @kalaivani-m-Y29OXC Aug 4, 2014

    Thanks for your reply, Kumar.

    Actually, I know the concept of deduplication. The point is that if it replaces the second, identical file with a pointer, there is no longer a second stored copy of the original data; only a pointer that refers to the original remains.

    If the original copy is damaged by a natural disaster or compromised by a hacker, we are at risk, because all we have is a pointer to a file that is now corrupted.

    So in that case the cloud's availability fails.

    Then what makes deduplication so important?
  • Gollapinni Karthik Sharma

    @gollapinni-karthik-sharma-EgBEXm Aug 4, 2014

    #-Link-Snipped-#

    PDF: https://www.eurecom.fr/en/publication/4136/download/rs-publi-4136.pdf

    Read those for an idea.

    If you do not understand, let me know and I will explain in detail later.
  • kalaivani M

    @kalaivani-m-Y29OXC Aug 5, 2014

    Thanks for the idea, Karthik.

    I have read that material.

    But my doubt remains: will cloud availability decrease because of deduplication, or not?
  • Gollapinni Karthik Sharma

    @gollapinni-karthik-sharma-EgBEXm Aug 6, 2014

    #-Link-Snipped-# : No, cloud availability will not decrease. Providers pair deduplication with redundancy techniques, and those keep availability high.

    Deduplication is a separate concern in the cloud. It removes accidental duplicates, while backup and replication techniques deliberately keep extra copies of the deduplicated data, so the data stays available and can also be made more secure.
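
    A minimal sketch of how deduplication and replication can coexist is shown here. It assumes a provider that places every unique chunk on several nodes; the `ReplicatedDedupStore` class, the replication factor of 3, and the placement rule are hypothetical and chosen only for illustration.

```python
import hashlib

REPLICAS = 3  # assumed replication factor, not taken from any real provider


class ReplicatedDedupStore:
    """Hypothetical store: duplicates are collapsed, unique chunks are replicated."""

    def __init__(self, node_count=5):
        # Each node is modelled as a dict: hash -> chunk bytes.
        self.nodes = [dict() for _ in range(node_count)]

    def _placement(self, digest):
        # Pick REPLICAS distinct nodes deterministically from the hash.
        start = int(digest, 16) % len(self.nodes)
        return [(start + k) % len(self.nodes) for k in range(REPLICAS)]

    def put_chunk(self, chunk):
        digest = hashlib.sha256(chunk).hexdigest()
        for n in self._placement(digest):
            # A duplicate upload finds the chunk already present and adds nothing.
            self.nodes[n].setdefault(digest, chunk)
        return digest  # the caller keeps this reference (the "pointer")

    def get_chunk(self, digest):
        # Try each replica in turn; any surviving copy is enough.
        for n in self._placement(digest):
            if digest in self.nodes[n]:
                return self.nodes[n][digest]
        raise KeyError(digest)

    def fail_node(self, index):
        # Simulate losing everything on one node.
        self.nodes[index].clear()


store = ReplicatedDedupStore()
ref_a = store.put_chunk(b"same content")
ref_b = store.put_chunk(b"same content")  # second user uploads identical data
assert ref_a == ref_b                     # both users hold the same reference

copies = sum(ref_a in node for node in store.nodes)
print(copies)                             # 3 copies in total, not 6

store.fail_node(store._placement(ref_a)[0])
print(store.get_chunk(ref_a))             # b'same content'
```

    In this sketch the pointer never refers to a single physical copy, so losing one node does not make the data unavailable.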

    They use RAID techniques such as RAID 1, 5, and 6, which protect data through mirroring or parity. (RAID 0 only stripes data for speed and adds no redundancy.)
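
    As a rough illustration of the parity idea behind RAID levels such as RAID 5, the sketch below rebuilds a lost block from the surviving blocks and an XOR parity block. The helper `xor_blocks` and the sample data are made up for the example; a real array also rotates parity across disks and works at the block-device level.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (hypothetical helper)."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three data blocks, imagined to live on three different disks.
d0 = b"DEDU"
d1 = b"PLIC"
d2 = b"ATE!"

# The parity block lives on a fourth disk.
parity = xor_blocks(d0, d1, d2)

# Suppose the disk holding d1 fails: XOR the survivors with the parity.
recovered = xor_blocks(d0, d2, parity)
print(recovered)  # b'PLIC'
```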
  • kalaivani M

    @kalaivani-m-Y29OXC Aug 8, 2014

    Oh, well, I have some idea now. Thanks for your help, Karthik. Thank you, CE!