How do I read a file from a ZIP using an InputStream?

I need to get the contents of a file from a ZIP archive (just one file, and I know its name) over SFTP. The only thing I have is the ZIP's InputStream. Most examples show how to get the contents with this statement:

ZipFile zipFile = new ZipFile("location");

But, as I said, I don't have the ZIP file on my local machine and I don't want to download it. Is an InputStream enough to read it?

UPD: This is what I do:

import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import com.jcraft.jsch.Channel;
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class SFTP {


    public static void main(String[] args) {

        String SFTPHOST = "host";
        int SFTPPORT = 3232;
        String SFTPUSER = "user";
        String SFTPPASS = "mypass";
        String SFTPWORKINGDIR = "/dir/work";
        Session session = null;
        Channel channel = null;
        ChannelSftp channelSftp = null;
        try {
            JSch jsch = new JSch();
            session = jsch.getSession(SFTPUSER, SFTPHOST, SFTPPORT);
            session.setPassword(SFTPPASS);
            java.util.Properties config = new java.util.Properties();
            config.put("StrictHostKeyChecking", "no");
            session.setConfig(config);
            session.connect();
            channel = session.openChannel("sftp");
            channel.connect();
            channelSftp = (ChannelSftp) channel;
            channelSftp.cd(SFTPWORKINGDIR);
            ZipInputStream zipStream = new ZipInputStream(channelSftp.get("file.zip"));
            ZipEntry entry = zipStream.getNextEntry();
            System.out.println(entry.getName()); // Yes, I got its name, now I need to get its content
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            // Close the channel before the session; either may be null if connecting failed.
            if (channel != null) channel.disconnect();
            if (session != null) session.disconnect();
        }


    }
}

java zip inputstream

Tony 26.05.2014

comment

Do I really have to write a new zip file if I only need to read the contents of its txt file? - Tony 26.05.2014

comment

There is no reason this should not work; you just need to get all the ZipEntries and save them from the stream. - Kenneth Clark 26.05.2014

Answers (7)

Score: 35

Below is a simple example of how to extract a ZIP file. You will still need to check whether an entry is a directory, but this is the simplest version.

The step you are missing is reading from the input stream into a buffer and writing that buffer to an output stream.

// Expands the zip file passed as argument 1, into the
// directory provided in argument 2
public static void main(String args[]) throws Exception
{
    if(args.length != 2)
    {
        System.err.println("zipreader zipfile outputdir");
        return;
    }

    // create a buffer to improve copy performance later.
    byte[] buffer = new byte[2048];

    // open the zip file stream
    InputStream theFile = new FileInputStream(args[0]);
    ZipInputStream stream = new ZipInputStream(theFile);
    String outdir = args[1];

    try
    {

        // now iterate through each item in the stream. The get next
        // entry call will return a ZipEntry for each file in the
        // stream
        ZipEntry entry;
        while((entry = stream.getNextEntry())!=null)
        {
            String s = String.format("Entry: %s len %d added %TD",
                            entry.getName(), entry.getSize(),
                            new Date(entry.getTime()));
            System.out.println(s);

            // Once we get the entry from the stream, the stream is
            // positioned ready to read the raw data, and we keep
            // reading until read returns 0 or less.
            String outpath = outdir + "/" + entry.getName();
            FileOutputStream output = null;
            try
            {
                output = new FileOutputStream(outpath);
                int len = 0;
                while ((len = stream.read(buffer)) > 0)
                {
                    output.write(buffer, 0, len);
                }
            }
            finally
            {
                // we must always close the output file
                if(output!=null) output.close();
            }
        }
    }
    finally
    {
        // we must always close the zip file.
        stream.close();
    }
}

The code snippet is taken from the following site:

http://www.thecoderscorner.com/team-blog/java-and-jvm/12-reading-a-zip-file-from-java-using-zipinputstream#.U4RAxYamixR

Kenneth Clark 26.05.2014

Score: 26

Well, this is what I did:

 zipStream = new ZipInputStream(channelSftp.get("Port_Increment_201405261400_2251.zip"));
 zipStream.getNextEntry();

 sc = new Scanner(zipStream);
 while (sc.hasNextLine()) {
     System.out.println(sc.nextLine());
 }

This lets me read the contents of the ZIP entry without writing it to another file.

Tony 26.05.2014

comment

Obviously, the file contents are still downloaded. You just don't have to write them to a (temporary) file. - Martin Prikryl 27.05.2014

comment

I think @KennethClark's solution is better. It works for both text and binary files, while yours works only for text files, IMHO. Note that although it saves the extracted content to a file, that is just an example of copying the content to another stream. It doesn't have to be a file stream; it could also be a memory stream, or it needn't be a stream at all. - Martin Prikryl 27.05.2014

comment

By the way, the text file inside the archive is about 1 MB (111589 lines of text), and reading it (the while (sc.hasNextLine()) loop without the sysout) takes 38 seconds. Is that normal? - Tony 27.05.2014

comment

Try @KennethClark's solution. I can imagine that Scanner might be slow. - Martin Prikryl 27.05.2014
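Following up on the comments above, here is a minimal sketch of the "memory stream" variant: it looks for one entry by name and copies it into a byte array with a plain buffered loop instead of Scanner. The class name ZipEntryReader and the method readEntry are hypothetical names chosen for illustration, and the sketch assumes the entry is small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
final class ZipEntryReader {

    /**
     * Reads the entry called entryName from a ZIP stream into memory.
     * Nothing is written to disk. The supplied stream is closed on return.
     */
    static byte[] readEntry(InputStream in, String entryName) throws IOException {
        try (ZipInputStream zipIn = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue; // the next getNextEntry() call skips this entry's data
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int len;
                // read() returns -1 at the end of the current entry
                while ((len = zipIn.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
                return out.toByteArray();
            }
        }
        throw new FileNotFoundException("No entry named " + entryName);
    }
}
```

With the question's setup this could be called as ZipEntryReader.readEntry(channelSftp.get("file.zip"), "data.txt"), where "data.txt" is a placeholder entry name. The archive is still transferred over the network up to the requested entry; it is just never stored locally.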

Score: 16

ZipInputStream is itself an InputStream and delivers the contents of each entry after each call to getNextEntry(). Take special care not to close the stream the contents are read from, because it is the same object as the ZIP stream:

public void readZipStream(InputStream in) throws IOException {
    ZipInputStream zipIn = new ZipInputStream(in);
    ZipEntry entry;
    while ((entry = zipIn.getNextEntry()) != null) {
        System.out.println(entry.getName());
        readContents(zipIn);
        zipIn.closeEntry();
    }
}

private void readContents(InputStream contentsIn) throws IOException {
    byte contents[] = new byte[4096];
    int direct;
    while ((direct = contentsIn.read(contents, 0, contents.length)) >= 0) {
        System.out.println("Read " + direct + " bytes content.");
    }
}

When reading the contents is delegated to other logic, it may be necessary to wrap the ZipInputStream in a FilterInputStream so that closing it closes only the entry, not the whole stream, as in:

public void readZipStream(InputStream in) throws IOException {
    ZipInputStream zipIn = new ZipInputStream(in);
    ZipEntry entry;
    while ((entry = zipIn.getNextEntry()) != null) {
        System.out.println(entry.getName());

        readContents(new FilterInputStream(zipIn) {
            @Override
            public void close() throws IOException {
                zipIn.closeEntry();
            }
        });
    }
}
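To illustrate why the wrapper matters, here is a rough sketch in which each entry is handed to a BufferedReader inside try-with-resources. Closing the reader closes the stream it wraps, so without the FilterInputStream the whole ZipInputStream would be closed after the first entry. The names ZipLineReader and readAllLines are made up for this example, and UTF-8 text content is assumed.

```java
import java.io.BufferedReader;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
final class ZipLineReader {

    /** Collects every line of every file entry as "entryName: line". */
    static List<String> readAllLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zipIn = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                // close() on this wrapper only ends the current entry
                InputStream entryIn = new FilterInputStream(zipIn) {
                    @Override
                    public void close() throws IOException {
                        zipIn.closeEntry();
                    }
                };
                // Closing the reader closes entryIn, not zipIn itself
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(entryIn, StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        lines.add(entry.getName() + ": " + line);
                    }
                }
            }
        }
        return lines;
    }
}
```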

haui 01.06.2018

comment

The FilterInputStream wrapper is especially useful. - Ng Zhong Qin 05.10.2020

Score: 3

The OP was close. You just need to read the bytes. The call to getNextEntry "positions the stream at the beginning of the entry data" (docs). If that is the entry we want (or the only entry), then the InputStream is in the right place. All we need to do is read that entry's decompressed bytes.

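// Note: getSize() returns -1 when the size is not known up front,
// so this approach only works when the entry reports its size.
byte[] bytes = new byte[(int) entry.getSize()];
int i = 0;
while (i < bytes.length) {
    // .read doesn't always fill the buffer we give it.
    // Keep calling it until we get all the bytes for this entry.
    int n = zipStream.read(bytes, i, bytes.length - i);
    if (n < 0) {
        break; // end of entry reached earlier than expected
    }
    i += n;
}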

So, if those bytes really are text, we can decode them into a String. I'm simply assuming UTF-8 encoding.

new String(bytes, "utf8")

Note: personally I use Apache commons-io IOUtils to cut down on this kind of low-level work. The docs for ZipInputStream.read seem to imply that reading stops at the end of the current zip entry. If so, reading the current text entry is a one-liner with IOUtils.

String text = IOUtils.toString(zipStream, StandardCharsets.UTF_8);
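As a possible alternative without an extra dependency, the sketch below relies on InputStream.readAllBytes(), which requires Java 9 or later. It reads until read() signals end of stream, which for a ZipInputStream is the end of the current entry, so the entry size does not need to be known in advance. The class and method names are invented for the example, and UTF-8 is again an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library. Requires Java 9+.
final class FirstEntryText {

    /** Returns the first entry of the ZIP stream decoded as UTF-8 text. */
    static String readFirstEntry(InputStream in) throws IOException {
        try (ZipInputStream zipIn = new ZipInputStream(in)) {
            if (zipIn.getNextEntry() == null) {
                return ""; // the archive has no entries
            }
            // Reads up to the end of the current entry only
            return new String(zipIn.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```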

Jason Dunkelberger 05.06.2019

Score: 0

Here is a more general solution for processing a zip input stream with a BiConsumer. It is almost the same solution as the one used by haui.

private void readZip(InputStream is, BiConsumer<ZipEntry,InputStream> consumer) throws IOException {
    try (ZipInputStream zipFile = new ZipInputStream(is);) {
        ZipEntry entry;
        while((entry = zipFile.getNextEntry()) != null){
            consumer.accept(entry, new FilterInputStream(zipFile) {
                @Override
                public void close() throws IOException {
                    zipFile.closeEntry();
                }
            });
        }
    }
}

You can use it simply by calling

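readZip(<some inputstream>, (entry, is) -> {
    /* don't forget to close this stream after processing. */
    try {
        is.read(); // ... <- to read each entry
    } catch (IOException e) { // BiConsumer.accept cannot throw checked exceptions
        throw new UncheckedIOException(e);
    }
});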
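Because BiConsumer.accept does not declare IOException, a lambda that reads from the entry stream has to deal with checked exceptions itself. One way around that, sketched here as a variation and not as the answer's own method, is a small custom functional interface whose method is allowed to throw IOException. EntryHandler and ZipWalker are hypothetical names.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
final class ZipWalker {

    /** Like BiConsumer<ZipEntry, InputStream>, but may throw IOException. */
    @FunctionalInterface
    interface EntryHandler {
        void handle(ZipEntry entry, InputStream content) throws IOException;
    }

    /** Calls the handler once per entry; closes the supplied stream at the end. */
    static void readZip(InputStream is, EntryHandler handler) throws IOException {
        try (ZipInputStream zipIn = new ZipInputStream(is)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                // The wrapper lets the handler call close() without ending the walk
                handler.handle(entry, new FilterInputStream(zipIn) {
                    @Override
                    public void close() throws IOException {
                        zipIn.closeEntry();
                    }
                });
            }
        }
    }
}
```

A handler can then read directly, for example (entry, content) -> { int first = content.read(); }, and any IOException propagates to the caller of readZip.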

ThomasCh 19.08.2019

Score: 0

Unzips an archive (zip) into the specified directory, preserving its file structure. Note: this code depends on org.apache.commons.io.IOUtils, but you can replace it with your own stream-copying code.

public static void unzipDirectory(File archiveFile, File destinationDir) throws IOException
{
  Path destPath = destinationDir.toPath();
  try (ZipInputStream zis = new ZipInputStream(new FileInputStream(archiveFile)))
  {
    ZipEntry zipEntry;
    while ((zipEntry = zis.getNextEntry()) != null)
    {
      Path resolvedPath = destPath.resolve(zipEntry.getName()).normalize();
      if (!resolvedPath.startsWith(destPath))
      {
        throw new IOException("The requested zip-entry '" + zipEntry.getName() + "' does not belong to the requested destination");
      }
      if (zipEntry.isDirectory())
      {
        Files.createDirectories(resolvedPath);
      } else
      {
        if(!Files.isDirectory(resolvedPath.getParent()))
        {
          Files.createDirectories(resolvedPath.getParent());
        }
        try (FileOutputStream outStream = new FileOutputStream(resolvedPath.toFile()))
        {
          IOUtils.copy(zis, outStream);
        }
      }
    }
  }
}

T.KH 24.04.2020

Score: 0

If your ZIP content consists of a single file (for example, the zipped body of an HTTP response), you can read the text content in Kotlin like this:

@Throws(IOException::class)
fun InputStream.readZippedContent() = ZipInputStream(this).use { stream ->
     stream.nextEntry?.let { stream.bufferedReader().readText() } ?: String()
}

This extension function unpacks the first entry of the ZIP file and reads its content as plain text.

Usage:

val inputStream: InputStream = ... // your zipped InputStream
val textContent = inputStream.readZippedContent()

mtwain 13.06.2020
