Identity-based controlled delegated outsourcing data integrity auditing scheme

System model

The scheme proposed in this paper contains a total of five entities: DO, KGC, PS, TPA and CSP, and the system model is shown in Fig. 2. Our proposed system model provides fine-grained control and authorization mechanisms to ensure data integrity with efficient dynamic data updating and source file auditing capabilities.

Figure 2

DO is the data owner, who stores data in the cloud. CSP provides a powerful storage service for DO. PS is the proxy service provider; DO authorizes a PS to upload data to the CSP on its behalf. DO blinds the data before uploading it to the PS. Blinding is a data-processing technique that conceals or protects sensitive information while maintaining the utility of the data. In our research, the primary purpose of employing blinding is to safeguard the privacy and confidentiality of the data, preventing unauthorized access and leakage. KGC generates private keys for DO and PS. In our system model, DO and KGC are trusted entities. CSP is an incompletely trusted entity: it is trustworthy in data privacy but untrustworthy in data integrity, as it may damage or tamper with DO’s data. TPA is a semi-trusted entity: it is trustworthy in integrity verification but untrustworthy in data privacy; it fulfills the auditing tasks of DO as required but is curious about the content of DO’s data.

Threat models

The proposed scheme faces three main types of attacks:

  1.

    A malicious PS can mimic a DO or another authorized PS or abuse the DO’s authorization delegation by processing the DO’s files and uploading them to CSP storage.

  2.

    Due to hardware failure or self-interest, a malicious CSP may delete or modify the DO’s files, especially those accessed less frequently.

  3.

    During the data outsourcing phase, the PS or CSP may steal the DO’s data. The semi-trusted TPA is unreliable in terms of data privacy and is curious about the content of the data outsourced by DOs during the audit process.

Security goals

Considering that the scheme may be subject to the above attacks, our scheme aims to achieve the following security objectives:

  1.

    Controllable delegation: Proofs of authorization generated by the DO can only be used by the specified PS to outsource the specified files. Even an authorized PS cannot misuse authorization proofs to outsource unspecified files, and multiple PSs cannot collude to infer new authorization proofs for outsourcing unspecified files.

  2.

    Audit correctness: Only when both the data evidence and the tag evidence generated by the CSP are valid can the proof pass the TPA’s validation.

  3.

    Privacy preservation: To protect the sensitive information in the outsourced data, the data must be blinded during the data outsourcing phase. During the audit process, the TPA cannot retrieve the user’s data information.

Overview of the proposed scheme

The identity-based controlled delegated outsourced data integrity auditing scheme proposed in this paper consists of three phases (setup phase, auditing phase, and dynamic updating phase); the setup and auditing phases contain the following eight polynomial-time algorithms: \(SysSetup\), \(KeyGen\), \(DeleGen\), \(DataBlind\), \(DataOutsourcing\), \(ChalGen\), \(ProofGen\), and \(ProofVerify\).

Setup phase

  1.

    \(SysSetup(1^\kappa ) \to (SysPara,msk)\). System setup algorithm. The KGC executes this algorithm and inputs the system security parameter \(\kappa\), outputs the system global parameter \(SysPara\), and the KGC’s master key \(msk\).

  2.

    \(KeyGen(SysPara,msk,ID_i ) \to sk_i\). Key generation algorithm. KGC executes this algorithm and takes as input the global parameters \(SysPara\), master key \(msk\), and identity \(ID_i\) and outputs the corresponding key \(sk_i\).

  3.

    \(DeleGen(SysPara,ID_o ,sk_o ,ID_p ) \to (\Phi ,\Gamma )\). Delegated authorization proof generation algorithm. The DO executes the algorithm and inputs the global parameter \(SysPara\), the PS’s identity \(ID_p\), the DO’s identity \(ID_o\), and its key \(sk_o\). It outputs the delegated proof of authorization information \(Dele\_Info = (\Phi ,\Gamma )\), where \(\Phi\) is the proof of authorization, and \(\Gamma\) is the delegated proof.

  4.

    \(DataBlind(M,\pi ) \to M^\prime\). Data blinding algorithm. The plaintext data M and the blinding factor \(\pi\) are used as input, and the blinded data \(M^\prime\) is output.

  5.

    \(DataOutsourcing(SysPara, Dele\_Info, sk_p, M^\prime) \to (\tau, M^*)\). Data outsourcing algorithm. The PS executes the algorithm, which takes as input the global parameters \(SysPara\), the delegated authorization proof information \(Dele\_Info\), the PS’s key \(sk_p\), and the blinded source data \(M^\prime\) to be outsourced, and outputs the corresponding file tag \(\tau\) and the processed data \(M^*\).

Audit phase

  1.

    \(ChalGen(SysPara,\tau ) \to chal\). Audit challenge generation algorithm. The TPA executes this algorithm by inputting the global parameters \(SysPara\) and the file tag \(\tau\). After the file tag \(\tau\) passes the legitimacy verification by the TPA, it outputs the audit challenge \(chal\).

  2.

    \(ProofGen(SysPara,chal) \to proof\). Proof generation algorithm. The CSP executes this algorithm, which takes as input the global parameters \(SysPara\) and the audit challenge \(chal\), and outputs the audit proof \(proof = (\delta, \eta)\), where \(\delta\) is the tag evidence and \(\eta\) is the data evidence.

  3.

    \(ProofVerify(SysPara,\tau ,proof) \to (True/False)\). Proof validation algorithm. The TPA executes this algorithm by inputting the global parameter \(SysPara\), audit challenge proof \(proof\) and the file tag \(\tau\), and outputting True when the evidence validation passes, otherwise outputting False.

Security definitions

This subsection presents formal security definitions for the above security goals. Two probabilistic polynomial-time adversaries, \(A_1\) and \(A_2\), simulate a malicious PS and a malicious CSP, respectively. \(A_1\) can simulate collusion to forge or misuse the DO’s authorization, and \(A_2\) can act as the CSP to modify the DO’s stored files without being detected.

Game 1: We define a security game against a malicious PS in the probabilistic polynomial-time setting. The adversary \(A_1\) plays the following game with the challenger \(C\):

Setup: The challenger \(C\) runs the system setup algorithm \(SysSetup(1^\kappa ) \to (SysPara,msk)\) to obtain the global parameters \(SysPara\) and the master key \(msk\) and sends \(SysPara\) to adversary \(A_1\).

Queries: Adversary \(A_1\) performs multiple polynomial queries to the challenger \(C\), and challenger \(C\) responds to the polynomial queries of the adversary \(A_1\) as follows:

  1.

    Extraction query: Adversary \(A_1\) may issue a private-key extraction query on any identity \(ID_i\). Challenger \(C\) computes the private key \(sk_i\) of \(ID_i\) and sends it to \(A_1\).

  2.

    Delegation query: Adversary \(A_1\) submits the authorization proof \(\Phi\) to challenger \(C\). If the DO’s private key \(sk_o\) has not been queried before, challenger \(C\) first generates \(sk_o\). Then challenger \(C\) responds with the corresponding delegation proof \(\Gamma\).

  3.

    File processing query: In this query, adversary \(A_1\) submits the proof of authorization \((\Phi, \Gamma)\) to the challenger \(C\). If the PS’s private key \(sk_p\) and the authorization proof \(\Phi\) have not been queried before, challenger \(C\) generates them first. Then challenger \(C\) responds with the processed file \(M^*\).

Output: Eventually, adversary \(A_1\) outputs a processed file \(\widehat{M}^*\) under a legal authorization \(\widehat{\Phi}\). Adversary \(A_1\) wins the game if all of the following conditions are met:

  1.

    Adversary \(A_1\) never issued a private-key extraction query on the identity \(ID_o\);

  2.

    Adversary \(A_1\) never issued a delegation query on the authorization proof \(\widehat{\Phi}\);

  3.

    Adversary \(A_1\) did not perform a file processing query involving the authorization proof \(\widehat{\Phi}\);

  4.

    The proof of delegation \(\widehat{\Gamma}\) of \(\widehat{ID}_o\) is valid for \(\widehat{ID}_p\).

Definition 1

An identity-based CDODIA scheme is secure against adaptive mimicry and abuse of delegated proofs if any probabilistic polynomial-time adversary \(A_1\) wins the above game against challenger \(C\) with only negligible probability.

Game 2: We define a security game against a malicious CSP with a probabilistic polynomial-time adversary \(A_2\), who plays the following game with the challenger \(C\):

Setup: Challenger \(C\) runs the system setup algorithm \(SysSetup(1^\kappa ) \to (SysPara,msk)\) to obtain the global parameters \(SysPara\) and the master key \(msk\) and sends \(SysPara\) to adversary \(A_2\).

Queries: Challenger \(C\) adaptively interacts with adversary \(A_2\) to run the integrity auditing protocol. Here adversary \(A_2\) plays the role of the prover, responding to any integrity challenge initiated by the challenger \(C\).

  1.

    Challenge generation query: Challenger \(C\) generates a challenge response for a particular data file and sends it to \(A_2\).

  2.

    Challenge response query: Adversary \(A_2\) generates evidence as a challenge response based on the processed file.

  3.

    Challenge validation query: Challenger \(C\) validates its challenge-response evidence and returns the validation results to the adversary \(A_2\).

Output: Finally, challenger \(C\) and adversary \(A_2\) complete the last round of the integrity audit protocol. Challenger \(C\) sends an audit challenge \(\widehat{chal}\) against a processed file \(\widehat{M}^*\), and adversary \(A_2\) generates evidence \(\widehat{proof}\) as the challenge response. Assume that the file \(\widehat{M}^*\) stored in the CSP has been corrupted or modified and that the audit challenge \(\widehat{chal}\) covers the corrupted or modified data. Suppose the evidence \(\widehat{proof}\) contains a tuple \((\widehat{\vartheta}, \{\widehat{\eta}_j\}_{1 \le j \le c})\) that is valid for the audit challenge \(\widehat{chal}\) and the processed file \(\widehat{M}^*\), yet differs from the tuple that would be generated from the audit challenge \(\widehat{chal}\) and the correctly maintained file \(M^*\). In that case, adversary \(A_2\) wins the game.

Definition 2

An identity-based CDODIA scheme is secure against data modification attacks if any probabilistic polynomial-time adversary \(A_2\) wins the above game only with negligible probability.

Detailed description of the proposed scheme

In this section, the three stages of the proposed scheme are described in detail, and the audit process is illustrated in Fig. 3.

Figure 3

Schematic diagram of the audit process.

Detailed description of setup phase

$$ SysSetup(1^\kappa ) \to (SysPara,msk) $$

KGC chooses two multiplicative cyclic groups \(G_1\) and \(G_2\) of large prime order \(p\), based on the security parameter \(\kappa\). \(g \in G_1\) is a randomly chosen generator. KGC selects a bilinear pairing \(e: G_1 \times G_1 \to G_2\), randomly selects the integer \(x \in \mathbb{Z}_p^*\) and the elements \((g_1, \{\mu_i\}_{0 \le i \le l}, \{v_i\}_{0 \le i \le \ell}, \{u_i\}_{0 \le i \le n}) \in G_1\), and computes \(y = g^x\) and the master key \(msk = g_1^x\). Three collision-resistant hash functions are chosen: \(H_1: \{0,1\}^* \to \{0,1\}^l\), \(H_2: \{0,1\}^* \to \{0,1\}^\ell\), and \(H_3: \{0,1\}^* \to G_1\).

The final system parameters \(SysPara = \{G_1, G_2, e, p, g, g_1, y, H_1, H_2, H_3, \{\mu_i\}_{0 \le i \le l}, \{v_i\}_{0 \le i \le \ell}, \{u_i\}_{0 \le i \le n}\}\) are published, and the master key \(msk\) is kept secret by KGC itself.

$$ KeyGen(SysPara,msk,ID_i ) \to sk_i $$

After receiving the identity \(ID_i\) from DO or PS, KGC generates the private key \(sk_i\) for DO and PS according to its own master key \(msk\), which is as follows: first, KGC calculates

$$ \widetilde{u}_i = (\text{u}_{i,1}, \text{u}_{i,2}, \ldots, \text{u}_{i,l}) \leftarrow H_1(ID_i) $$

(1)

Then, it randomly selects \(\sigma_i \in \mathbb{Z}_p^*\) and generates the private key \(sk_i = (sk_{i,1}, sk_{i,2})\), where

$$ sk_{i,1} = msk \cdot \left(\mu_0 \cdot \prod\limits_{j=1}^{l} \mu_j^{\text{u}_{i,j}}\right)^{\sigma_i} $$

(2)

and

$$ sk_{i,2} = g^{\sigma_i} $$

(3)

Finally, KGC sends \(sk_i\) to DO or PS. DO and PS can verify whether their private key \(sk_i\) is legitimate by calculating \(\widetilde{u}_i\) and checking the equation

$$ e(sk_{i,1}, g) \mathop{=}\limits^{?} e(y, g_1) \cdot e\left(\mu_0 \cdot \prod\limits_{j=1}^{l} \mu_j^{\text{u}_{i,j}}, sk_{i,2}\right) $$

(4)

If the equation is valid, they accept the private key \(sk_i\) and reject it otherwise.
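For intuition, the structure of the key-verification equation (Eq. 4) can be checked in a toy model where every \(G_1\) element \(g^a\) is represented by its discrete logarithm \(a\), so the pairing \(e(g^a, g^b) = e(g,g)^{ab}\) becomes multiplication of exponents modulo a prime. This sketch illustrates the algebra only and is not a secure pairing implementation; the prime, bit length \(l\), and helper names are our assumptions:

```python
import secrets

p = (1 << 61) - 1  # a Mersenne prime standing in for the group order (toy choice)

# In this toy model a G1 element g^a is represented by its exponent a, and the
# pairing e(g^a, g^b) = e(g, g)^{ab} is represented by a*b mod p. A product of
# group elements corresponds to a sum of exponents.
def pair(a, b):
    return (a * b) % p

def rand():
    return secrets.randbelow(p - 1) + 1

# KGC setup: y = g^x, g1 = g^t, msk = g1^x  (exponents x, t, and t*x)
x, t = rand(), rand()
y, g1 = x, t
msk = (t * x) % p

# Hypothetical identity digest: l bits u_{i,1..l}; mu_0..mu_l are random generators.
l = 8
u_bits = [secrets.randbelow(2) for _ in range(l)]
mu = [rand() for _ in range(l + 1)]

# Key generation: sk_{i,1} = msk * (mu_0 * prod mu_j^{u_{i,j}})^{sigma_i}, sk_{i,2} = g^{sigma_i}
sigma_i = rand()
mu_term = (mu[0] + sum(mu[j + 1] for j in range(l) if u_bits[j])) % p  # exponent of the product
sk1 = (msk + mu_term * sigma_i) % p
sk2 = sigma_i

# Verification, Eq. (4): e(sk_{i,1}, g) == e(y, g1) * e(mu-term, sk_{i,2})
lhs = pair(sk1, 1)                            # g has exponent 1
rhs = (pair(y, g1) + pair(mu_term, sk2)) % p  # product in G2 = sum of exponents
assert lhs == rhs
```

The identity holds because the exponent of \(sk_{i,1}\) is exactly \(tx + \sigma_i \cdot \log_g(\mu_0 \prod \mu_j^{\text{u}_{i,j}})\), matching the sum of the two pairings on the right.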

$$ DeleGen(SysPara,ID_o ,sk_o ,ID_p ) \to (\Phi ,\Gamma ) $$

To authorize a legitimate proxy PS, the DO generates an authorization proof \(\Phi\) containing the DO’s identity \(ID_o\) and the proxy PS’s identity \(ID_p\). The authorization proof \(\Phi\) may also contain other information related to the source data M, such as the time of delegation, file type, etc., i.e., \(\Phi = (ID_o ||ID_p ||TimeStamp||Type)\). Based on the system parameter \(SysPara\), the DO first calculates

$$ \widetilde{\varphi} = (\phi_1, \phi_2, \ldots, \phi_\ell) \leftarrow H_2(\Phi) $$

(5)

Then, it randomly selects \(\sigma_\phi \in \mathbb{Z}_p^*\) and generates the proof of delegation \(\Gamma = (\alpha, \beta, \gamma)\), where

$$ \alpha = sk_{o,1} \cdot \left(v_0 \cdot \prod\limits_{j=1}^{\ell} v_j^{\phi_j}\right)^{\sigma_\phi}, \;\; \beta = sk_{o,2}, \;\; \gamma = g^{\sigma_\phi}. $$

Finally, the proof of delegation \((\Phi, \Gamma)\) is sent to the proxy PS. After receiving the delegation information from the DO, the PS can verify its legitimacy by calculating \(\widetilde{u}_o\) and \(\widetilde{\varphi}\) and checking the equation

$$ e(\alpha, g) \mathop{=}\limits^{?} e(y, g_1) \cdot e\left(\mu_0 \cdot \prod\limits_{j=1}^{l} \mu_j^{\text{u}_{o,j}}, \beta\right) \cdot e\left(v_0 \cdot \prod\limits_{j=1}^{\ell} v_j^{\phi_j}, \gamma\right) $$

(6)

If the equation is valid, the PS receives the delegation from the DO; otherwise the delegation request is rejected.

$$ DataBlind(M,\pi ) \to M^\prime $$

Given an outsourced source file \(M \in \{0,1\}^*\) with legitimate proof-of-authorization information, the DO divides the file into \(n\) data blocks, \(M = \{m_1, m_2, \ldots, m_n\}\). A blinding factor \(\pi \in \mathbb{Z}_p^*\) is randomly selected, and each blinded data block \(m_i^\prime = (m_i || i) + \pi\) is computed. Finally, the blinded file \(M^\prime\) is sent to the PS.
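As a minimal sketch of the blinding step, assuming the blocks are small integers and modeling the concatenation \((m_i || i)\) by bit-packing (a simplifying assumption; the helper names are ours):

```python
import secrets

p = 2**61 - 1  # toy stand-in for the prime group order p

def data_blind(blocks, pi):
    """Blind each block m_i as m_i' = (m_i || i) + pi, all arithmetic mod p.
    The concatenation (m_i || i) is modeled by packing the index into the
    low 16 bits, a simplifying assumption of this sketch."""
    return [((m << 16 | i) + pi) % p for i, m in enumerate(blocks, start=1)]

def data_unblind(blinded, pi):
    """The DO can undo the blinding with the secret factor pi."""
    return [(((b - pi) % p) >> 16) for b in blinded]

M = [12, 34, 56]                   # toy data blocks m_1..m_n
pi = secrets.randbelow(p - 1) + 1  # blinding factor, secret to the DO
M_blind = data_blind(M, pi)
assert data_unblind(M_blind, pi) == M
```

Because \(\pi\) is uniform in \(\mathbb{Z}_p^*\), each blinded block reveals nothing about \(m_i\) on its own, which is the property the PS-facing upload relies on.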

$$ DataOutsourcing(SysPara,Dele\_Info,sk_p ,M^\prime) \to (\tau ,M^* ) $$

After receiving the blinded data from the DO, the PS first divides the blinded data \(M^\prime\) into sectors, \(c\) \((1 \le c \le n)\) per block, i.e., \(M^\prime = \{m_{i,j}^\prime\}_{r \times c}\). Second, PS randomly selects the source file identifier \(F_{id} \in \mathbb{Z}_p^*\) and a random value \(\sigma_F \in \mathbb{Z}_p^*\), computes \(v_F = g^{\sigma_F}\), and sets \(\tau_0 = \Phi || F_{id} || v_F || r\). PS selects a signature algorithm \(PS.Sign = (KeyGen, Sign, Verify)\)32 to sign \(\tau_0\) and generates the source file tag \(\tau = \tau_0 || PS.Sign.Sign(\tau_0, ssk) || spk\), where the signing public–private key pair \((ssk, spk)\) is generated by the algorithm \(PS.Sign.KeyGen(1^\kappa)\). Then PS uses its private key \(sk_p\) to sign each data block:

$$ \vartheta_i = sk_{p,1} \cdot \left(H_3(\Psi) \cdot \prod\limits_{j=1}^{c} u_j^{m_{i,j}^\prime}\right)^{\sigma_F} $$

(7)

where \(\Psi = \Phi || F_{id} || i\). Finally, the processed source file \(M^* = (M^\prime, F_{id}, \Gamma, \{\vartheta_i\}_{1 \le i \le r})\) and \((\tau, \{\vartheta_i\}_{1 \le i \le r})\) are sent to the CSP and the TPA, respectively, and \(M^*\) and \(\{\vartheta_i\}_{1 \le i \le r}\) are deleted locally.
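The sector arrangement \(M^\prime = \{m_{i,j}^\prime\}_{r \times c}\) performed by the PS can be sketched as follows; zero-padding of the last row is our assumption, since the paper does not specify how a ragged final block is handled:

```python
def split_into_sectors(blinded_blocks, c):
    """Arrange the flat blinded data M' as an r x c matrix {m'_{i,j}}:
    pad the list to a multiple of c, then cut it into rows of c sectors."""
    padded = blinded_blocks + [0] * (-len(blinded_blocks) % c)
    return [padded[k:k + c] for k in range(0, len(padded), c)]

# toy blinded values; r = 3 rows of c = 2 sectors, zero-padded
matrix = split_into_sectors([11, 22, 33, 44, 55], c=2)
# -> [[11, 22], [33, 44], [55, 0]]
```

Each row \(i\) of this matrix is what the block signature \(\vartheta_i\) commits to.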

Detailed description of audit phase

In this phase, the TPA performs integrity auditing of the cloud data from time to time, which contains the following algorithms:

$$ ChalGen(SysPara,\tau ) \to chal $$

Before the integrity audit begins, the TPA runs the algorithm \(PS.Sign.Verify(\tau_0, spk)\) to verify the legitimacy of the file tag \(\tau\). If the validation fails, the audit request is rejected; if it passes, the TPA randomly selects a non-empty subset \(S_F\) of the set \([1, r]\), randomly selects \(s_i \in \mathbb{Z}_p^*\) for each \(i \in S_F\) to generate the audit challenge \(chal = \{(i, s_i)\}_{i \in S_F}\), and finally sends \((chal, F_{id})\) to the CSP.
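A sketch of the challenge sampling, assuming the TPA fixes the number of challenged blocks (the sample size is our parameter, not the paper's):

```python
import secrets

p = 2**61 - 1  # toy stand-in for the prime order p

def chal_gen(r, sample_size):
    """Pick a non-empty subset S_F of block indices [1, r] and a random
    coefficient s_i in Z_p* per chosen index: chal = {(i, s_i)}_{i in S_F}."""
    indices = sorted(secrets.SystemRandom().sample(range(1, r + 1), sample_size))
    return [(i, secrets.randbelow(p - 1) + 1) for i in indices]

chal = chal_gen(r=100, sample_size=5)  # challenge 5 of 100 stored blocks
```

Sampling a subset rather than all \(r\) blocks is what keeps the audit cost sublinear in the file size; the random coefficients \(s_i\) prevent the CSP from replaying a previous proof.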

$$ ProofGen(SysPara,chal) \to proof $$

Upon receiving the audit challenge from the TPA, the CSP locates the file \(M^*\) to be integrity audited based on \((chal, F_{id})\) and calculates the tag proof

$$ \vartheta = \prod\limits_{i \in S_F} \vartheta_i^{s_i} $$

(8)

and data proof

$$ \eta_j = \sum\limits_{i \in S_F} s_i \cdot m_{i,j}^\prime \quad (j \in [1, c]) $$

(9)

Finally, \(proof = (\vartheta, \{\eta_j\}_{1 \le j \le c})\) is sent to the TPA along with the delegation \(\Gamma\).
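The CSP-side aggregation of Eqs. (8) and (9) can be sketched with toy values, treating the tags \(\vartheta_i\) as elements of \(\mathbb{Z}_p^*\) so the group product becomes modular exponentiation (an illustration of the bookkeeping only, not of the pairing group):

```python
p = 2**61 - 1  # toy prime standing in for both the group order and Z_p

def proof_gen(chal, tags, sectors):
    """CSP side: tag proof  theta = prod_{i in S_F} theta_i^{s_i}   (toy group, mod p)
                 data proof eta_j = sum_{i in S_F} s_i * m'_{i,j}   (mod p)."""
    theta = 1
    c = len(sectors[0])
    eta = [0] * c
    for i, s_i in chal:
        theta = theta * pow(tags[i - 1], s_i, p) % p   # aggregate the challenged tags
        for j in range(c):
            eta[j] = (eta[j] + s_i * sectors[i - 1][j]) % p
    return theta, eta

# toy data: 4 blocks of c = 2 sectors, with toy tags theta_i in Z_p*
sectors = [[5, 7], [11, 13], [17, 19], [23, 29]]
tags = [3, 5, 7, 11]
theta, eta = proof_gen([(1, 2), (3, 4)], tags, sectors)
# eta[0] = 2*5 + 4*17 = 78, eta[1] = 2*7 + 4*19 = 90
```

Note that the proof size is constant in the number of challenged blocks: one group element \(\vartheta\) plus \(c\) scalars, regardless of \(|S_F|\).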

$$ ProofVerify(SysPara,\tau ,proof) \to (True/False) $$

After receiving the proof, the TPA calculates \((\widetilde{u}_o, \widetilde{u}_p, \widetilde{\varphi})\) based on the system parameters \(SysPara\) and the file tag \(\tau\). Then the TPA verifies the legitimacy of the delegation \(\Gamma\) according to Eq. (6); if the equation holds, the PS possesses a legitimate proof of authorization. Finally, the TPA performs the integrity audit by checking the equation

$$ e(\vartheta, g) \mathop{=}\limits^{?} \left(e(y, g_1) \cdot e\left(\mu_0 \cdot \prod\limits_{j=1}^{l} \mu_j^{\text{u}_{p,j}}, sk_{p,2}\right)\right)^{\sum\limits_{i \in S_F} s_i} \cdot e\left(\prod\limits_{i \in S_F} H_3(\Psi)^{s_i} \cdot \prod\limits_{j=1}^{c} u_j^{\eta_j}, v_F\right) $$

(10)

If the equation holds, the algorithm outputs True and the file stored in the CSP is complete; otherwise, it outputs False and the file is incomplete.

Detailed description of dynamic update phase

The B-RBT structure proposed in this scheme can effectively support dynamic updating of data (modification, insertion and deletion) in the following process:

  1.

    Data insertion process: When DO needs to insert a new data block after a certain data block, DO generates the corresponding dynamic update information \(Update\_Info\_PS = (Insert, i, Dele\_Info^\prime, M^\prime)\) and sends it to PS, where Insert is the insert command, \(i\) is the position to be inserted, \(M^\prime\) is the newly inserted data, and \(Dele\_Info^\prime\) is the delegated authorization proof information for the new data \(M^\prime\). After receiving the dynamic update information, PS processes the new data \(M^\prime\), generates the new update information \(Update\_Info\_TPA = (Insert, i, M^{\prime *})\), and sends it to the TPA. The TPA receives the update information, stores the signature of the data block to be inserted in the B-RBT data structure, inserts a new list node at the specified position of the specified bucket B, and calls the repair process of bucket B. When the bucket exceeds the maximum threshold, if the repair pointer PK has not yet reached the root node R, the repair process of B is called repeatedly until PK reaches R. Then bucket B is split into two buckets \(\{B_1, B_2\}\), the new parent node \(B_{12}\) of \(\{B_1, B_2\}\) is added to the tree as a red node, and the pointers of the two new buckets are made to point to the newly inserted nodes. Finally, the repair process is called again for one of the two buckets to complete the signature storage, and these data records are updated in the CSP. Figure 4a shows that after inserting data into bucket S in Fig. 1, the capacity of the bucket exceeds the maximum threshold, so bucket S is divided into two buckets; node K is used to manage the two new buckets and is labeled red.

    Figure 4
  2.

    Data deletion process: When the DO needs to delete a data block, DO generates the corresponding dynamic update information \(Update\_Info\_PS = (Delete, i, Dele\_Info^\bullet)\) and sends it to PS, where Delete is the delete command, \(i\) is the position of the deleted data block, and \(Dele\_Info^\bullet\) is the delegated authorization proof information of the data block to be deleted. After receiving the dynamic update information, PS generates the new update information \(Update\_Info\_TPA = (Delete, i)\) and sends it to the TPA. The TPA first finds the list node to be deleted in bucket B, removes it, and invokes the repair process of bucket B twice. When the bucket falls below the minimum threshold, if the repair pointer PK has not yet reached the root node R, the repair process of B is repeated until PK reaches R. If the size of the sibling bucket \(B^\prime\) of bucket B exceeds the minimum threshold, B borrows a node from \(B^\prime\) and calls the repair process twice for B. If the size of B's sibling bucket \(B^\prime\) does not exceed the minimum threshold, B and \(B^\prime\) are merged to form a new bucket \(B^{\prime\prime}\), where the keys of the right bucket are placed at the end of the left bucket, and the parent node of \(B^{\prime\prime}\) is the grandparent node of the original B or \(B^\prime\). If the deleted parent bucket was black, \(B^{\prime\prime}\) is marked double black to complete the deletion of the data block signature, and these data records are updated in the CSP. Figure 4b shows the merging of bucket L and its sibling bucket \(L^\prime\) under node M, when the size of bucket L's sibling \(L^\prime\) does not exceed the minimum threshold after deleting the data of bucket L in Fig. 1.

  3.

    Data modification process: This operation corrects certain information stored in the CSP. Its concrete steps are the same as those of insertion and deletion: a modification can be viewed as deleting the original data block and then inserting the new data block.
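The bucket split/borrow/merge logic of the B-RBT described above can be sketched as follows, with hypothetical capacity thresholds and the red-black recoloring and repair-pointer details elided (function names and thresholds are ours):

```python
MAX_BUCKET = 4  # hypothetical maximum bucket capacity
MIN_BUCKET = 2  # hypothetical minimum bucket capacity

def insert(bucket, pos, sig):
    """Insert a block signature at a position; split the bucket in half when
    it exceeds the maximum threshold (the new red parent node that manages
    the two halves is elided in this sketch)."""
    bucket.insert(pos, sig)
    if len(bucket) > MAX_BUCKET:
        mid = len(bucket) // 2
        return bucket[:mid], bucket[mid:]   # B is split into {B1, B2}
    return (bucket,)

def delete(bucket, pos, sibling):
    """Delete a signature; on underflow, borrow from the sibling if it can
    spare a node, otherwise merge (right bucket appended to the left)."""
    bucket.pop(pos)
    if len(bucket) >= MIN_BUCKET:
        return bucket, sibling
    if len(sibling) > MIN_BUCKET:
        bucket.append(sibling.pop(0))       # borrow one node from B'
        return bucket, sibling
    return bucket + sibling, None           # merge into B''

assert insert([1, 2, 3, 4], 2, 9) == ([1, 2], [9, 3, 4])
```

Keeping buckets between the two thresholds bounds the tree height, which is what makes each dynamic update touch only a logarithmic number of nodes.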