
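1 Introduction

This post covers:

  • When a proxy is needed

  • How to set up a proxy in Java, the available approaches, and an example from a real-world product SDK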

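2 When a proxy is needed

Network environments vary. For example, the scenario I described [here](/posts/tiaoban_https/#2-2-%E5%86%85%E7%BD%91%E4%BA%91%E4%B8%BB%E6%9C%BA%E5%A6%82%E4%BD%95%E8%AE%BF%E9%97%AE%E5%A4%96%E7%BD%91), where the program is deployed on an intranet cloud host and has to go through a proxy to reach external interfaces, is a very typical one. This means that an API SDK should offer a general way of accessing the API that also works when intranet traffic has to go through a proxy.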

3 How I implemented the proxy in Java

Usually we configure a forward proxy on the intranet host, for example with vim /etc/profile:

export HTTP_PROXY=http://172.26.3.141:3128/
export HTTPS_PROXY=https://172.26.3.141:3128/

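This sets the environment variables.

With this in place, commands such as curl can reach the Internet, but a Java program cannot. Why not? As I see it, the question is which peer the TCP socket is opened to. curl reads the proxy settings from the environment, so it works. A Java program with no special handling opens its TCP connection directly to the target site, and from the intranet that connection cannot succeed.

Normally we write an HTTP request like this, which is a direct connection: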

URL url = new URL(surl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();

So, before the two lines of code above, we can try adding a proxy for Java in the following way, which makes the access work:

 public void addProxy() {
    // Another approach: https://ask.csdn.net/questions/379970
    // https://yq.aliyun.com/articles/294
    // https://blog.csdn.net/qincidong/article/details/82454427

//            HttpHost proxy = new HttpHost(Application.PROXY_HOST, Application.PROXY_PORT);
//            RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy)
//                    .setSocketTimeout(socketTimeout)
//                    .setConnectTimeout(connectTimeout).build();

    // Or a squid proxy may be required
    //https://blog.csdn.net/redhat456/article/details/6149774/
    //https://blog.csdn.net/kfanning/article/details/5481552
    //https://stackoverflow.com/questions/1432961/how-do-i-make-httpurlconnection-use-a-proxy
    //https://community.oracle.com/thread/1691437

    System.setProperty("java.net.useSystemProxies","true");
    String hostname = "";
    String port = "";

    // https://blog.csdn.net/lirx_tech/article/details/51005281 (should this be specified as a list?)
    //String testUrl="https://172.26.3.141:3128/";

    //https://blog.csdn.net/qq_29951983/article/details/79230671
    //https://www.cnblogs.com/littleatp/p/4729781.html
    MyProxySelector myProxySelector = new MyProxySelector();
    ProxySelector.setDefault(myProxySelector);

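    // MyProxySelector ignores the URI argument, so passing null is safe here;
    // the JDK's default selector would throw IllegalArgumentException on null.
    List<Proxy> proxies = ProxySelector.getDefault().select(null);

    for (Proxy proxy : proxies) {
        log.debug("!!!proxy type : " + proxy.type());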

        InetSocketAddress addr = (InetSocketAddress) proxy.address();

        if (addr == null) {
            log.debug("!!!No Proxy");
        } else {
            log.debug("!!!proxy hostname : " + addr.getHostName());
            hostname = addr.getHostName();
            log.debug("!!!proxy port : " + addr.getPort());
            port = String.valueOf(addr.getPort());
        }
    }
    Properties systemProperties = System.getProperties();
    systemProperties.setProperty("http.proxyHost", hostname);
    systemProperties.setProperty("http.proxyPort", port);
    systemProperties.setProperty("https.proxyHost", hostname);
    systemProperties.setProperty("https.proxyPort", port);
}

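As I understand it, a system property is something the JVM itself recognizes; it can be set when the JVM is launched or from program code, and URL connections honor it by default. This is an inference I drew from the experiment below and took for granted; I did not have time to dig any deeper.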
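To make the inference above a little more concrete, here is a minimal sketch that sets the standard `http.proxyHost`/`http.proxyPort` (and `https.*`) properties and then asks the JVM-wide default `ProxySelector` what it would use for a given URL. The class name `SystemPropertyProxyDemo` and the helper `proxiesFor` are made up for illustration, and the sketch assumes the JDK's default selector is installed (that is, `ProxySelector.setDefault` has not been called with a custom one). No network connection is made.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class SystemPropertyProxyDemo {

    /** Asks the JVM-wide default ProxySelector which proxies it would use for the target URL. */
    static List<Proxy> proxiesFor(String target) {
        return ProxySelector.getDefault().select(URI.create(target));
    }

    public static void main(String[] args) {
        // Same effect as launching the JVM with
        // -Dhttp.proxyHost=172.26.3.141 -Dhttp.proxyPort=3128 (and the https.* pair)
        System.setProperty("http.proxyHost", "172.26.3.141");
        System.setProperty("http.proxyPort", "3128");
        System.setProperty("https.proxyHost", "172.26.3.141");
        System.setProperty("https.proxyPort", "3128");

        for (Proxy proxy : proxiesFor("https://www.baidu.com/")) {
            InetSocketAddress addr = (InetSocketAddress) proxy.address();
            if (addr == null) {
                // A DIRECT proxy has no address
                System.out.println(proxy.type() + " (no proxy)");
            } else {
                System.out.println(proxy.type() + " " + addr.getHostString() + ":" + addr.getPort());
            }
        }
    }
}
```

Because system properties belong to the JVM process rather than to a class or a jar, any code in the same process that opens a connection with plain `url.openConnection()` would see the same answer.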

import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.net.*;
import java.util.ArrayList;
import java.util.List;

/**
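 * The production environment uses a jump host: the program is deployed on an intranet host and must go through a proxy on a public-facing host to reach external HTTP/HTTPS interfaces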
 * @author tao
 *
 */
public class MyProxySelector extends ProxySelector{
    final static org.slf4j.Logger log = LoggerFactory.getLogger(MyProxySelector.class);

    String PROXY_ADDR = "172.26.3.141";
    int PROXY_PORT = 3128;

    /**
     * Selects all the applicable proxies based on the protocol to
     * access the resource with and a destination address to access
     * the resource at.
     * The format of the URI is defined as follow:
     * <UL>
     * <LI>http URI for http connections</LI>
     * <LI>https URI for https connections</LI>
     * <LI>{@code socket://host:port}<br>
     * for tcp client sockets connections</LI>
     * </UL>
     *
     * @param uri The URI that a connection is required to
     * @return a List of Proxies. Each element in the
     * List is of type
     * {@link Proxy Proxy};
     * when no proxy is available, the list will
     * contain one element of type
     * {@link Proxy Proxy}
     * that represents a direct connection.
     * @throws IllegalArgumentException if the argument is null
     */
    @Override
    public List<Proxy> select(URI uri) {
        List<Proxy> list = new ArrayList<Proxy>();
        list.add(new Proxy(Proxy.Type.HTTP, new InetSocketAddress(PROXY_ADDR, PROXY_PORT)));
        return list;
    }

    /**
     * Called to indicate that a connection could not be established
     * to a proxy/socks server. An implementation of this method can
     * temporarily remove the proxies or reorder the sequence of
     * proxies returned by {@link #select(URI)}, using the address
     * and the IOException caught when trying to connect.
     *
     * @param uri The URI that the proxy at sa failed to serve.
     * @param sa  The socket address of the proxy/SOCKS server
     * @param ioe The I/O exception thrown when the connect failed.
     * @throws IllegalArgumentException if either argument is null
     */
    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        log.debug("!!!Cannot connect to proxy server {}:{}", PROXY_ADDR, PROXY_PORT);
    }
}

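So, after consulting a whole pile of web pages, I ended up with a solution that is not flexible. I originally wanted it to be flexible, but this MyProxySelector is little more than a holder for two attributes, a host and a port, and the main problem is that the IP and port are hard-coded. Reading them from an environment variable and parsing the value would get it most of the way there. In any case the solution is incomplete, for several reasons:

  • When should the proxy be enabled? The caller still has to write that condition, and if the condition changes, the code has to change with it.

  • Should the proxy be installed on every request or only once? If only once, what is the criterion for deciding that?

  • If the proxy requires authentication, how is that handled?

Even so, there is nothing really wrong with it as long as there is no authentication, it is for internal use only, the decision to enable the proxy depends on whether we are in the production or the debugging environment, and the proxy is simply installed on every request. Used this way, the solution works fine.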
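One way to remove the hard-coded IP and port is to hand the selector a proxy URL at construction time. The following is only a sketch under assumptions, not the code used in the project: the class name `ConfigurableProxySelector`, the fallback port 80, and the rule that loopback hosts bypass the proxy are all choices made here for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

/** Hypothetical variant of MyProxySelector that takes its proxy from a URL string. */
final class ConfigurableProxySelector extends ProxySelector {
    private final Proxy proxy;

    /** @param proxyUrl e.g. "http://172.26.3.141:3128/"; null or blank means "connect directly" */
    ConfigurableProxySelector(String proxyUrl) {
        if (proxyUrl == null || proxyUrl.trim().isEmpty()) {
            this.proxy = Proxy.NO_PROXY;
            return;
        }
        URI u = URI.create(proxyUrl.trim());
        if (u.getHost() == null) {
            throw new IllegalArgumentException("Proxy URL has no host: " + proxyUrl);
        }
        int port = u.getPort() != -1 ? u.getPort() : 80; // assumed default when no port is given
        // createUnresolved avoids a DNS lookup at construction time
        this.proxy = new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(u.getHost(), port));
    }

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        String host = uri.getHost();
        // Assumption: loopback traffic (e.g. calls to our own endpoints) never needs the proxy
        if (host != null && (host.equals("localhost") || host.startsWith("127."))) {
            return Collections.singletonList(Proxy.NO_PROXY);
        }
        return Collections.singletonList(proxy);
    }

    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        System.err.println("Cannot connect to proxy " + sa + " for " + uri + ": " + ioe);
    }

    public static void main(String[] args) {
        // Install once at startup; the value could equally come from a config file
        ProxySelector.setDefault(new ConfigurableProxySelector(System.getenv("HTTP_PROXY")));
        System.out.println(ProxySelector.getDefault().select(URI.create("https://www.baidu.com/")));
    }
}
```

Installing the selector once at startup also sidesteps the "every request or only once" question, since the decision to proxy or not is then made inside `select` for each URI.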

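4 Then a problem appeared

In the deployment described above, which requires a forward proxy, we used Alibaba Cloud's SMS service. It provides an SDK that is essentially built on HTTP requests and wraps signing, the other authentication methods, and endpoint management. Unfortunately we got a socket connect timeout, which plainly means the network was unreachable.

Then we called addProxy: still unreachable! Why do our own URL connections work as soon as the proxy is added? My instinct was that the Alibaba SDK might have its own requirements for proxy settings, so I would have to look into that by reading the documentation or the SDK itself.

First, though, I wanted to take a closer look at my dear addProxy, the thing I had started with: why is its effect not global? To test its scope I needed one web project and one jar, with three call sites in the web project: call 1 uses the HTTP helper inside the project with addProxy, call 2 uses the HTTP helper inside the project without the proxy, and call 3 uses the external jar without the proxy.

The project is built with Spring Boot and a main class, which is very convenient. The file tree is: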

├─main
│  ├─java
│  │  └─de
│  │      └─tao
│  │          └─proxytest
│  │              │  Starter.java
│  │              │
│  │              ├─controller
│  │              │      TestController1.java
│  │              │      TestController2.java
│  │              │      TestController3.java
│  │              │
│  │              └─http
│  │                      HttpUtils1.java
│  │                      HttpUtils2.java
│  │                      MyProxySelector.java
│  │
│  ├─resources
│  │      application-dev.yml
│  │      application-pro.yml
│  │      application.yml
│  │
│  └─webapp
│          index.html

The core configuration is:

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.tao</groupId>
    <artifactId>proxytest</artifactId>
    <version>1.0-SNAPSHOT</version>
    <!--<packaging>war</packaging>-->

    <properties>
        <failOnMissingWebXml>false</failOnMissingWebXml>
        <start-class>de.tao.proxytest.Starter</start-class>
    </properties>


    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.0.0.RELEASE</version>
        <relativePath />
    </parent>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>de.tao</groupId>
        <artifactId>proxytestinvoke</artifactId>
        <version>1.0-SNAPSHOT</version>
    </dependency>

</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
            <executions>
                <execution>
                    <!--https://blog.csdn.net/ya2dan/article/details/50786464-->
                    <!--Command to package the WAR: mvn compile package -Dmaven.test.skip=true -X -->
                    <goals>
                        <goal>repackage</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

</project>

The other classes are:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class Starter extends org.springframework.boot.web.servlet.support.SpringBootServletInitializer{
    public static void main(String[] args) {
        SpringApplication.run(Starter.class, args);
    }
}

import de.tao.proxytest.http.HttpUtils1;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpMethod;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController1 {
    @Autowired private HttpUtils1 utils1;
    @GetMapping(value="/url1")
    public String access1() {
        String s = utils1.req(HttpMethod.GET, "https://www.baidu.com", null);
        String result = "返回的内容为:";
        if (s != null && s.length() > 0) {
            result += s.replace("<!","");
        }
        return result;
    }
}

import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.*;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;

@Component
public class HttpUtils1 {
    final private org.slf4j.Logger log = LoggerFactory.getLogger(this.getClass());

    @Value("${spring.profiles.active}")
    private String profile;

    public String req(org.springframework.http.HttpMethod method, String surl, String reqContent) {
        String result = "";
        BufferedReader reader = null;

        try {
            //if ("pro".equals(profile)) {
                addProxy();
            //}
            URL url = new URL(surl);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();

            conn.setRequestMethod(method.name());
            conn.setConnectTimeout(1000);
            // Data is sent in the request body
            conn.setDoOutput(true);
            conn.setDoInput(true);
            // POST must not use caches
            conn.setUseCaches(false);
            conn.setRequestProperty("Connection", "Keep-Alive");
            conn.setRequestProperty("Charset", "UTF-8");

            // Set the content type:
            //conn.setRequestProperty("Content-Type","multipart/form-data");
            //conn.addRequestProperty("Content-Type", "multipart/form-data; boundary=--taoych");

            // Set the accepted type, otherwise a 415 error is returned
            conn.setRequestProperty("accept","*/*"); // brute-force approach: accept all types to guard against a 415 response
            //conn.setRequestProperty("accept","application/json");

            if (reqContent != null) {
                // Send data to the server
                byte[] writebytes = reqContent.getBytes();
                // Set the content length
                conn.setRequestProperty("Content-Length", String.valueOf(writebytes.length));

                OutputStream outwritestream = conn.getOutputStream();
                outwritestream.write(writebytes);
                outwritestream.flush();
                outwritestream.close();
            }
            conn.connect();
            int code = conn.getResponseCode();

            if (code == 200) {
                reader = new BufferedReader(
                        new InputStreamReader(conn.getInputStream()));
                result = reader.readLine();
            }
            log.debug("!req done, code: {}, result: {}", code, result);
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return result;
    }
}

The external jar project is even simpler. Its file structure is:

├─main
│  ├─java
│  │  └─com
│  │      └─aliyun
│  │              HttpUtils2.java

Its configuration is:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.tao</groupId>
    <artifactId>proxytestinvoke</artifactId>
    <version>1.0-SNAPSHOT</version>


</project>

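It has only one class, whose content is the same as HttpUtils1 except that, as described above, the external jar does not add the proxy.

Copy the web jar and the plain Java jar to the intranet cloud host, run the web jar with java -jar xx.jar, then run the following curl commands on the public cloud host. The results are shown below (the prefix 返回的内容为: printed by the controller means "Returned content:"):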

curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url2
返回的内容为:

curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url1
返回的内容为:DOCTYPE html>

curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url2
返回的内容为:DOCTYPE html>

curl http://172.26.3.142:8082/proxytest-1.0-SNAPSHOT/url3
返回的内容为:DOCTYPE html>

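The first call to /url2 returns nothing because no proxy has been set yet. Once /url1 has run addProxy, /url2 and /url3 succeed as well. So the effect of the system properties is global and extends to the external jar; after all, everything runs inside the same JVM process, which is where my earlier inference came from. Then why does the Alibaba SDK turn a blind eye to this global setting? The only way to find out is to read the Alibaba SDK.

5 How the Alibaba Cloud SDK handles proxies

The Alibaba SDK is invoked with just these few lines: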

IClientProfile profile = DefaultProfile.getProfile("cn-hangzhou", accessKeyId, accessKeySecret);
DefaultProfile.addEndpoint("cn-hangzhou", product, domain);
IAcsClient acsClient = new DefaultAcsClient(profile);

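Judging from the exception that was thrown, the failure occurred in the method com.aliyuncs.http.clients.CompatibleUrlConnClient#syncInvoke. All of the code below comes from the aliyun-java-sdk-core 4.2.3 package. The method looks like this: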

@Override
public HttpResponse syncInvoke(HttpRequest request) throws IOException {
    OutputStream out = null;
    InputStream content = null;
    HttpResponse response = null;
    HttpURLConnection httpConn = buildHttpConnection(request);

    try {
        httpConn.connect();
        if (null != request.getHttpContent() && request.getHttpContent().length > 0) {
            out = httpConn.getOutputStream();
            out.write(request.getHttpContent());
        }
        content = httpConn.getInputStream();
        response = new HttpResponse(httpConn.getURL().toString());
        parseHttpConn(response, httpConn, content);
        return response;
    } catch (IOException e) {
        content = httpConn.getErrorStream();
        response = new HttpResponse(httpConn.getURL().toString());
        parseHttpConn(response, httpConn, content);
        return response;
    } finally {
        if (content != null) { content.close(); }
        httpConn.disconnect();
    }
}

Seeing buildHttpConnection, I had a good idea where to look. Stepping into it confirmed the suspicion; the following code is the crux of the problem:

HttpURLConnection httpConn = null;
if (url.getProtocol().equalsIgnoreCase("https")) {
  if (sslSocketFactory != null) {
      Proxy proxy = getProxy("HTTPS_PROXY", request);
      HttpsURLConnection httpsConn = (HttpsURLConnection)url.openConnection(proxy);
      httpsConn.setSSLSocketFactory(sslSocketFactory);
      httpConn = httpsConn;
  }
}

if (httpConn == null) {
  Proxy proxy = getProxy("HTTP_PROXY", request);
  httpConn = (HttpURLConnection)url.openConnection(proxy);
}

The most directly relevant code is this:

private Proxy getProxy(String env, HttpRequest request) throws MalformedURLException, UnsupportedEncodingException {
    Proxy proxy = Proxy.NO_PROXY;
    String httpProxy = System.getenv(env);
    if (httpProxy != null) {
        URL proxyUrl = new URL(httpProxy);
        String userInfo = proxyUrl.getUserInfo();
        if (userInfo != null) {
            byte[] bytes = userInfo.getBytes("UTF-8");
            String auth = DatatypeConverter.printBase64Binary(bytes);
            request.putHeaderParameter("Proxy-Authorization", "Basic " + auth);
        }
        String hostname = proxyUrl.getHost();
        int port = proxyUrl.getPort();
        if (port == -1) {
            port = proxyUrl.getDefaultPort();
        }
        SocketAddress addr = new InetSocketAddress(hostname, port);
        proxy = new Proxy(Proxy.Type.HTTP, addr);
    }
    return proxy;
}

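As we can see, it uses the values of the environment variables HTTP_PROXY and HTTPS_PROXY; note the upper case. When the variable is not set, getProxy returns Proxy.NO_PROXY, and url.openConnection(Proxy.NO_PROXY) forces a direct connection, so the proxy system properties set by addProxy are never consulted. It also takes care of proxy authentication, which is the standard way of handling proxies.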
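The parsing step can be tried in isolation. The sketch below is not the SDK's code: it mirrors the same idea with a made-up class `ProxyFromUrl`, and it uses `java.util.Base64` in place of `DatatypeConverter`, since `javax.xml.bind` is no longer bundled with the JDK from Java 11 onwards. The credentials `user:pass` are placeholders.

```java
import java.net.InetSocketAddress;
import java.net.MalformedURLException;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ProxyFromUrl {

    /** Builds an HTTP Proxy from a URL such as "http://user:pass@172.26.3.141:3128/". */
    static Proxy toProxy(String proxyUrl) throws MalformedURLException {
        URL u = new URL(proxyUrl);
        // Fall back to the scheme's default port when none is given
        int port = u.getPort() != -1 ? u.getPort() : u.getDefaultPort();
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(u.getHost(), port));
    }

    /** Returns the Proxy-Authorization header value, or null when the URL carries no credentials. */
    static String basicAuthHeader(String proxyUrl) throws MalformedURLException {
        String userInfo = new URL(proxyUrl).getUserInfo();
        if (userInfo == null) {
            return null;
        }
        byte[] bytes = userInfo.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) throws MalformedURLException {
        String sample = "http://user:pass@172.26.3.141:3128/"; // placeholder credentials
        System.out.println(toProxy(sample));
        System.out.println(basicAuthHeader(sample)); // prints "Basic dXNlcjpwYXNz"
    }
}
```

A connection opened with `url.openConnection(ProxyFromUrl.toProxy(...))` uses exactly that proxy, regardless of any system properties or default `ProxySelector`.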

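6 Back to solving the problem

Let's try it, then: what does the addProxy project get when it reads the environment variable HTTP_PROXY? I had clearly set http_proxy in /etc/profile. And there it is: lower case!

Adding this piece of code shows the result: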

String httpProxy = System.getenv("HTTP_PROXY");
System.out.println("system proxy env: " + httpProxy);
httpProxy = System.getenv("http_proxy");
System.out.println("system proxy lowercase env: " + httpProxy);

The result is:

system proxy env: null
system proxy lowercase env: http://172.26.3.141:3128/

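Sure enough, the upper-case variable cannot be read. So I set it: I added the two upper-case variables to the profile file and loaded it with the source command. The current Xshell session window now had them, and running the test program again printed the value.

Then I restarted the Tomcat program and found it still did not work! I removed the lower-case variables: still no luck! What on earth? Running env in the Xshell session window that starts Tomcat showed that the upper-case HTTP_PROXY was missing there. After running source in that window, it appeared. So running source in one Xshell session window does not propagate to other session windows; each session keeps its own environment.

In the end, adding one environment variable solved the problem; most of the time went into testing and tracking it down.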
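Since the whole issue came down to upper versus lower case, code that reads the proxy from the environment can simply try both spellings. This is a minimal sketch with a hypothetical helper `firstSet`; the environment is passed in as a map so that the lookup order is easy to check without touching the real process environment.

```java
import java.util.Map;

final class ProxyEnv {

    /** Returns the first non-blank value among the given variable names, or null if none is set. */
    static String firstSet(Map<String, String> env, String... names) {
        for (String name : names) {
            String value = env.get(name);
            if (value != null && !value.trim().isEmpty()) {
                return value.trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // Upper case is tried first, then the lower-case spelling
        System.out.println("http proxy:  " + firstSet(env, "HTTP_PROXY", "http_proxy"));
        System.out.println("https proxy: " + firstSet(env, "HTTPS_PROXY", "https_proxy"));
    }
}
```

Which spelling should win when both are set is a design choice; upper case first is just the order picked here.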

7 Perhaps our proxy approach can now be improved

To be continued.